Why String intern method? [duplicate] - java

This question already has answers here:
When should we use intern method of String on String literals
(14 answers)
Closed 9 years ago.
As explained in the post "When should we use intern method of String on String constants", String literals are pooled automatically, but objects constructed with new are not, so the intern method is used for those. But even when we call intern, a new object is still created first, so what is the use of the intern method?
String s = "Example";
String s1 = new String("Example"); // will create new object
String s2 = new String("Example").intern(); // new still creates an object,
// but because we call intern we get the reference to the pooled string "Example"
Now:
System.out.println(s == s1); // will return false
System.out.println(s == s2); // will return true
System.out.println(s1 == s2); // will return false
So what's the use of intern method?
Edit
I understand how the intern method works, but my question is why it exists, because to call intern we must first create a String object using new, which creates a new instance of String!
String s3 = new String("Example"); // again new object
String s4 = s3.intern();
System.out.println(s3 == s4); // will return false
So calling intern does not make s3 point to the pooled string; it only returns a reference to the pooled string.
Also, does calling intern push the string into the pool if it is not already pooled? Does that mean every string I call intern on is pushed to the pool?

The basic algorithm for .intern() is the following:
Keep a hash set of Strings (the pool)
Check to see if the String you're dealing with is already in the set
If so, return the one from the set
Otherwise, add this string to the set and return it
So intern basically looks up the given string in the pool: if an equal string exists there, you get that same instance back; otherwise the string itself is added to the pool and returned.
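The steps above can be sketched with an ordinary map. This is only an illustration of the idea, not how the JVM implements its pool; the class name SimpleInternPool and its intern method are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class SimpleInternPool {
    // Maps each distinct character sequence to its one canonical instance.
    private final Map<String, String> pool = new HashMap<>();

    /** Returns the pooled instance equal to s, adding s itself if none exists yet. */
    public String intern(String s) {
        String existing = pool.putIfAbsent(s, s);
        return existing != null ? existing : s;
    }

    public static void main(String[] args) {
        SimpleInternPool p = new SimpleInternPool();
        String a = new String("Example");
        String b = new String("Example");
        System.out.println(a == b);                     // false: two distinct objects
        System.out.println(p.intern(a) == p.intern(b)); // true: one canonical instance
        System.out.println(p.intern(a) == a);           // true: a was the first one pooled
    }
}
```

Note that the lookup is by content (equals/hashCode), while the result is a single shared reference, which is what makes == comparisons on interned strings meaningful.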

Here is the sequence of events:
String s = "Example";
The String literal "Example" is placed in the pool.
String s1 = new String("Example");
// will create new object <-- Correct, this just creates a new object
String s2 = new String("Example").intern();
A new object is still created by new, but intern() then looks the string up in the pool. Because the literal "Example" is already pooled, the pooled instance (the one s refers to, not s1) is returned.
I hope you can see that intern gives you a way to use the String from the pool.
Furthermore, in Java all Strings are objects, so the pool is effectively a set of references to String objects, one per distinct character sequence.
I remember a very good thread on Stack Overflow itself that is worth checking: "Is String Literal Pool a collection of references to the String Object, Or a collection of Objects".

This method returns a canonical representation of the string object. It follows that for any two strings s and t, s.intern() == t.intern() is true if and only if s.equals(t) is true.
Here is its signature:
public String intern()
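A minimal check of that contract, using only String.intern() from the standard library. The class name InternContract and the helper sameCanonical are made up for this illustration.

```java
public class InternContract {
    /** True when both strings intern to the same pooled object. */
    public static boolean sameCanonical(String s, String t) {
        return s.intern() == t.intern();
    }

    public static void main(String[] args) {
        String s = new String("Example");
        String t = new String("Example");
        String u = new String("Other");
        System.out.println(s == t);              // false: distinct heap objects
        System.out.println(s.equals(t));         // true: same content
        System.out.println(sameCanonical(s, t)); // true: equal content, same pooled object
        System.out.println(sameCanonical(s, u)); // false: different content
    }
}
```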

Related

Java Strings - What is the difference between "Java" and new String("Java")? [duplicate]

This question already has answers here:
Difference between string object and string literal [duplicate]
(13 answers)
Closed last year.
Given this example code:
class basic {
public static void main(String[] args) {
String s1 = "Java";
String s2 = new String("Java");
}
}
Are s1 and s2 both reference variables of an object?
Do those two lines of code do the same thing?
Lines 3 and 4 (the two assignments) don't do the same thing, as:
String s1 = "Java"; may reuse an instance from the string constant pool if one is available, whereas new String("Java"); creates a new and referentially distinct instance of a String object.
Therefore, lines 3 and 4 don't do the same thing.
Now, let's have a look at the following code:
String s1 = "Java";
String s2 = "Java";
System.out.println(s1 == s2); // true
s2 = new String("Java");
System.out.println(s1 == s2); // false
System.out.println(s1.equals(s2)); // true
== on two reference types is a reference identity comparison. Two objects that are equal according to equals are not necessarily ==. Usually, it is wrong to use == on reference types, and most of the time equals needs to be used instead.
Initializing a String with the new keyword, as in String s2 = new String("Java");, always creates a new object on the heap, outside the string pool. Like every String, that object is immutable; only the variable s2 can be reassigned to refer to a different object.
Whereas String s1 = "Java" initializes the variable directly from a literal, which refers to an object in the string pool. If an equal string is already pooled, no new object is created and the reference to the existing object is used.
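A quick way to convince yourself that a String is immutable however it was created: methods such as concat return a new object and leave the original untouched. The class name ImmutabilityDemo and the helper shout are hypothetical names for this sketch.

```java
public class ImmutabilityDemo {
    /** Returns a new string; the argument itself is never modified. */
    public static String shout(String s) {
        return s.concat("!");
    }

    public static void main(String[] args) {
        String literal = "Java";
        String viaNew = new String("Java");
        String shouted = shout(viaNew);
        System.out.println(viaNew);                 // Java (unchanged)
        System.out.println(shouted);                // Java!
        System.out.println(literal.equals(viaNew)); // true: same content
        System.out.println(literal == viaNew);      // false: different objects
    }
}
```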

In Java what is the difference between two String initialization methods? [duplicate]

This question already has answers here:
Literal string creation vs String object creation
(3 answers)
Closed 3 years ago.
String str = "ABC";
String str2 = new String("ABC");
With both approaches, calling hashCode() gives the same value.
I saw the explanation in the Toptal questions: https://www.toptal.com/java/interview-questions
"In general, String s = "Test" is more efficient to use than String s = new String("Test").
In the case of String s = "Test", a String with the value “Test” will be created in the String pool. If another String with the same value is then created (e.g., String s2 = "Test"), it will reference this same object in the String pool.
However, if you use String s = new String("Test"), in addition to creating a String with the value “Test” in the String pool, that String object will then be passed to the constructor of the String Object (i.e., new String("Test")) and will create another String object (not in the String pool) with that value. Each such call will therefore create an additional String object (e.g., String s2 = new String("Test") would create an addition String object, rather than just reusing the same String object from the String pool)."
Both expressions give you a String object, but there is a subtle difference between them. When you create a String object using the new operator, it always creates a new object in heap memory. On the other hand, if you create an object using String literal syntax, e.g. "Java", it may return an existing object from the String pool (a cache of String objects, which has been moved to heap space in recent Java releases) if one already exists. Otherwise it creates a new String object and puts it in the String pool for future reuse.

How does String.intern() work and how does it affect the String pool?

As we know, the String.intern() method adds the String value to the string pool if it does not already exist there. If it exists, it returns the reference to that pooled object.
String str = "Cat"; // creates a new object in the string pool with this character sequence
String str1 = "Cat"; // refers to the same pooled object that was created for 'str'
str == str1 // returns true
String test = new String("dog");
test.intern(); // what does this line of code do behind the scenes?
I need to know: when I call test.intern(), what will this intern method do?
Does it add "dog" to the string pool as a different object, or does it add the test object's reference to the string pool (I think that's not the case)?
I tried this
String test1 = "dog";
test == test1 // returns false
I just want to make sure: when I call test.intern(), does it create a new object with the same value in the String pool? Do I now have two objects with the value "dog", one directly on the heap and the other in the String pool?
When I call test.intern(), what will this intern method do?
It will put the "dog" string in the string pool (unless it's already there). But it will not necessarily put the object that test refers to in the pool. This is why you typically do
test = test.intern();
Note that if you have a "dog" literal in your code, then there will already be a "dog" in the string pool, so test.intern() will return that object.
Perhaps your experiment confuses you, and it was in fact the following experiment you had in mind:
String s1 = "dog"; // "dog" object from string pool
String s2 = new String("dog"); // new "dog" object on heap
System.out.println(s1 == s2); // false
s2 = s2.intern(); // intern() returns the object from string pool
System.out.println(s1 == s2); // true

String s = new String("xyz"). How many objects has been made after this line of code execute?

The commonly agreed answer to this interview question is that two objects are created by the code. But I don't think so; I wrote some code to confirm.
public class StringTest {
public static void main(String[] args) {
String s1 = "a";
String s2 = "a";
String s3 = new String("a");
System.out.println("s1: "+s1.hashCode());
System.out.println("s2: "+s2.hashCode());
System.out.println("s3: "+s3.hashCode());
}
}
The output is:
Does this mean that only one object was created?
Reaffirm: My question is how many object was created by the following code:
String s = new String("xyz")
Instead of the StringTest code.
Inspired by @Don Branson, I debugged the code below:
public class test {
public static void main(String[] args) {
String s = new String("abc");
}
}
And the result is:
The id of s is 84, and the id of "abc" is 82. What exactly does this mean?
THERE ARE ERRORS BELOW DEPENDING ON THE JVM/JRE THAT YOU USE. IT IS BETTER TO NOT WORRY ABOUT THINGS LIKE THIS ANYWAYS. SEE COMMENTS SECTION FOR ANY CORRECTIONS/CONCERNS.
First, this question really asks about this addressed here:
Is String Literal Pool a collection of references to the String Object, Or a collection of Objects
So, that is a guide for everyone on this matter.
...
Given this line of code: String s = new String("xyz")
There are two ways of looking at this:
(1) What happens when the line of code executes -- the literal moment it runs in the program?
(2) What is the net effect of how many Objects are created by the statement?
Answer:
1) After this executes, one additional object is created.
a) The "xyz" String is created and interned when the JVM loads the class that this line of code is contained in.
If an "xyz" is already in the intern pool from some other code, then the literal might produce no new String object.
b) When the new String s is created, the internal char[] is a copy of the interned "xyz" string.
c) That means, when the line executes, there is only one additional object created.
The fact is the "xyz" object will have been created as soon as the class loaded and before this code section was ever run.
...next scenario ...
2) There are three objects created by the code (including the interned "a")
String s1 = "a";
String s2 = "a";
String s3 = new String("a");
a) s1 and s2 are just references, not objects, and they point to the same String in memory.
b) The "a" is interned and is a compound object: one char[] object and the String object itself. It consists of two objects in memory.
c) s3, new String("a"), produces one more object. The new String("a") does not copy the char[] of "a"; it only references it internally. Here is the constructor:
public String(String original) {
this.value = original.value;
this.hash = original.hash;
}
One interned String ("a") equals 2 Objects. And one new String("a") equals one more object. Net effect from code is three objects.
Two objects will be created for this:
String s = new String("abc");
One is on the heap and the other is in the "string constant pool" (SCP). The reference s always points to the heap object. Objects in the SCP are not collected like ordinary short-lived heap objects: a pooled literal stays reachable for as long as the class that defines it is loaded, which in practice often means until JVM shutdown.
For example:
Here by using a heap object reference we are getting the corresponding SCP object reference by call of intern()
String s1 = new String("abc");
String s2 = s1.intern(); // SCP object reference
System.out.println(s1==s2); // false
String s3 = "abc";
System.out.println(s2==s3); //True s3 reference to SCP object here
String s = new String("xyz");
The line above creates two objects: one on the heap and another in the String constant pool.
Now if we do this:
String s = new String("xyz");
String s1 ="xyz";
the two statements above still create two objects in total.
The first line, String s = new String("xyz");, creates two objects as described above. When String s1 = "xyz"; executes, the string constant pool is checked for an object with the same content; since the first line already made an entry for "xyz", the same reference is returned and no other object is created.
What if we have these four lines together, as shown below?
String s2 = new String("xyz");
String s3 ="xyz";
String s4 = new String("xyz");
String s5 ="xyz";
If we execute the lines above, we will have three objects.
The first line, as mentioned, creates two objects: one on the heap and another in the String constant pool.
When the second line executes, it checks the string constant pool, finds "xyz" there, and returns the same object, so up to the second line we have two objects.
When the third line executes, it creates a new object on the heap, since the new operator always creates an object there, so up to the third line we have three objects.
When the fourth line executes, it checks the string constant pool, finds "xyz", and returns the same object, so after the fourth line we still have three objects.
Bonus about the intern() method
When the intern() method is invoked on a String object it looks the
string contained by this String object in the pool, if the string is
found there then the string from the pool is returned. Otherwise, this
String object is added to the pool and a reference to this String
object is returned.
public class TestString {
public static void main(String[] args) {
String s1 = "Test";
String s2 = "Test";
String s3 = new String("Test");
final String s4 = s3.intern();
System.out.println(s1 == s2);
System.out.println(s2 == s3);
System.out.println(s3 == s4);
System.out.println(s1 == s3);
System.out.println(s1 == s4);
System.out.println(s1.equals(s2));
System.out.println(s2.equals(s3));
System.out.println(s3.equals(s4));
System.out.println(s1.equals(s4));
System.out.println(s1.equals(s3));
}
}
//Output
true
false
false
false
true
true
true
true
true
true
See the magic of intern by applying the intern method to the new String object.
intern is applied here, so it checks whether "Test" is available in the String constant pool. Since "Test" is already there, intern returns that same object, so s3 has the same reference as s1 and s2, and every comparison prints true.
public class TestString {
public static void main(String[] args) {
String s1 = "Test";
String s2 = "Test";
String s3 = new String("Test").intern();
final String s4 = s3.intern();
System.out.println(s1 == s2);
System.out.println(s2 == s3);
System.out.println(s3 == s4);
System.out.println(s1 == s3);
System.out.println(s1 == s4);
System.out.println(s1.equals(s2));
System.out.println(s2.equals(s3));
System.out.println(s3.equals(s4));
System.out.println(s1.equals(s4));
System.out.println(s1.equals(s3));
}
}
true
true
true
true
true
true
true
true
true
true
There are two ways to create string objects in Java:
Using the new operator, i.e.
String s1 = new String("abc");
Using a string literal, i.e.
String s2 = "abc";
String allocation is costly in both time and memory, so the JVM (Java Virtual Machine) does some extra work to reduce it. What work?
Whenever you use the new operator, the object is created and the JVM does not look in the string pool. It just creates the object. But when you use a string literal to create a String object, the JVM looks in the string pool first.
I.e., when you write
String s2 = "abc";
the JVM looks in the string pool and checks whether "abc" already exists. If it exists, a reference to the existing string "abc" is returned and no new object is created; if it doesn't exist, an object is created.
So in your case
(a)
String s1 = new String("abc");
Since new is used the object is created
(b)
String s2 = "abc";
using a string literal an object is created and "abc" is not in the
string pool and therefore the object is created.
(c)
String s2 = "abc";
Again using a string literal and "abc" is in the string pool, and
therefore the object is not created.
You can also check it out by using the following code:
class String_Check
{
public static void main(String[] n)
{
String s1 = new String("abc");
String s2 = "abc";
String s3 = "abc";
if (s1==s2)
System.out.println("s1==s2");
if(s1==s3)
System.out.println("s1==s3");
if(s2==s3)
System.out.println("s2==s3");
}
}
I hope this helps. Note that == checks whether two references point to the same object, while the equals(Object) method checks whether the contents are equal.
If we execute String s = new String("Brajesh");, two objects are created: one in the string literal pool and another in the heap area.
But if the same string literal object already exists, then only one object is created.
For example:
String s1 ="Brajesh";
String s = new String("Brajesh");//it will create only one object in heap area
Apart from this, one additional object is also created in the heap area: the char[] object. I have attached a snapshot of heap memory here.
There are so many random answers and so I am confident that my interviewer will also be not very sure :) :)
I researched a lot and found that hashcode is not the memory address and the variables while debugging don't give the memory address. So, those parameters might confuse.
2 or 3 objects are created, depending on how smart the compiler is.
Nevertheless, your test is junk, because hashCode of Strings is based on the content of the String, and not on their identity. If you want to check for identity, you should use System.identityHashCode or just == comparison.
The compiler and the runtime are allowed (not forced) to optimize string creation whenever possible. So, they optimize literal strings, by using a single literal for the three strings you have.
Anyway, the new operator must return a new object (i.e. a newly allocated one).
String optimization at runtime is possible if the static method String.valueOf is used instead. But I don't know if any caching is actually applied by current JREs (maybe it's more expensive to check a hash table than to just allocate a new String)
String s1="Pune";
String s2="Mumbai";
String s3="Pune";
String s4=new String("Mumbai");
System.out.println("S1 :"+s1.hashCode()); //S1 :2499228
System.out.println("S2 :"+s2.hashCode()); //S2 :-1979126203
System.out.println("S3 :"+s3.hashCode()); //S3 :2499228
System.out.println("S4 :"+s4.hashCode()); //S4 :-1979126203
System.out.println(s2==s4); // false
As we can see in the program above, s2 and s4 have the same hash code, although == returns false. That is because String.hashCode() depends only on the content, while the == operator compares references.
The literal "Mumbai" is already in the string pool (s2 refers to it), and String s4=new String("Mumbai") creates a second, separate object on the heap, not on the stack. So s2 == s4 compares the pooled object with the heap object and is false.
public String(String original) {
int size = original.count;
char[] originalValue = original.value;
char[] v;
if (originalValue.length > size) {
// The array representing the String is bigger than the new
// String itself. Perhaps this constructor is being called
// in order to trim the baggage, so make a copy of the array.
int off = original.offset;
v = Arrays.copyOfRange(originalValue, off, off+size);
} else {
// The array representing the String is the same
// size as the String, so no point in making a copy.
v = originalValue;
}
this.offset = 0;
this.count = size;
this.value = v;
}
Looking at this code, we can see that the constructor does not always allocate a new char[]: when the array is exactly the size of the string, the new String shares the original's char[] (v = originalValue), and a copy is made only when the array is larger than the string. The literal itself is stored in the String Constant Pool.
1) Taken from the SCP:
String s1 = "a";
String s2 = "a";
2) Creates a new object:
String s3 = new String("a");
Out of curiosity, another new object:
String s2 = new String("a");
In all of the code above, the String objects end up referring to the same char[] value.
I ran it in the Eclipse debugger. In that context, two objects are created, one with the id 17, the other 22.
java.lang.String overrides the hashCode() method so that the value depends on the content of the string.
As a result, hashCode() does not tell you anything about the number of instances. It may be the same string or may be another instance with no single byte shared. Same about equals(). This explains your output.
Use System.identityHashCode(..) for this kind of research.
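A short sketch of the difference, using only standard library calls. The class name IdentityCheck and its two helper methods are invented for this example. One caveat: distinct objects are not guaranteed to have different identity hash codes, so == remains the definitive identity check.

```java
public class IdentityCheck {
    /** Content-based: equal strings always have equal hashCode() values. */
    public static boolean sameContentHash(String a, String b) {
        return a.hashCode() == b.hashCode();
    }

    /** Identity-based: true only when both references point to one object. */
    public static boolean sameObject(Object a, Object b) {
        return a == b;
    }

    public static void main(String[] args) {
        String s1 = "a";
        String s2 = "a";
        String s3 = new String("a");
        System.out.println(sameContentHash(s1, s3)); // true, yet they are different objects
        System.out.println(sameObject(s1, s2));      // true: both refer to the pooled literal
        System.out.println(sameObject(s1, s3));      // false: s3 is a separate heap object
        // The same object always reports the same identity hash code.
        System.out.println(System.identityHashCode(s1) == System.identityHashCode(s2)); // true
    }
}
```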
And may the source be with you.
@Giulio, you are right.
String s3 = new String("abc"); creates two objects: one on the heap, referenced by s3, and another in the SCP (without a reference from any variable).
And now String s2 = "abc"; doesn't create any new object in the SCP because "abc" is already there.
String s1 = "abc";
String s2 = "abc";
String s3 = new String("abc");
String s4 = s3.intern();
System.out.println("s1: "+System.identityHashCode(s1));
System.out.println("s2: "+System.identityHashCode(s2));
System.out.println("s3: "+System.identityHashCode(s3));
System.out.println("s4: "+System.identityHashCode(s4));
O/P:s1: 366712642,
s2: 366712642,
s3: 1829164700,
s4: 366712642
As I am not eligible to comment, I wrote it here.
If we run the code below in Eclipse in debug mode, we get an idea of how many objects are created by String string = new String("manoj");. The literal "manoj" is itself a String object that is passed to the String class constructor.
Just check the id shown when hovering over each reference in the debugger.
public static void main(String[] args)
{
String str = "atul";
String string = new String("manoj");
String string2 = "manoj";
System.out.println(str == string);
}
Confused about what exactly happens when new String("<>") is called, I found this thread. Your understanding of the hashcode comparison is not technically correct, though.
int hashCode() is overridden in the String class and returns a value that depends on the content of the String.
String s1 = new String("Hello");
String s2 = new String("Hello");
So s1.hashCode() = s2.hashCode() = anyStringOfContent_"Hello".hashCode()
/** Cache the hash code for the string */
private int hash; // Default to 0
public int hashCode() {
int h = hash;
if (h == 0 && value.length > 0) {
char val[] = value;
for (int i = 0; i < value.length; i++) {
h = 31 * h + val[i];
}
hash = h;
}
return h;
}
To understand why it is done this way, you can read the Kathy Sierra book, which has a great explanation of why the developers did it in this manner (basically, any objects for which equals() returns true must return the same hashCode() value).
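The loop from the String source above can be reproduced outside the class to see that the hash depends only on the characters. This is a sketch; the class name HashDemo and the method manualHash are hypothetical, and the value for "Pune" is the one quoted earlier in this thread.

```java
public class HashDemo {
    /** Recomputes the documented String hash: h = 31 * h + c for each character c. */
    public static int manualHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        String s1 = new String("Hello");
        String s2 = new String("Hello");
        System.out.println(s1 == s2);                           // false: two objects
        System.out.println(s1.hashCode() == s2.hashCode());     // true: same content
        System.out.println(manualHash("Hello") == s1.hashCode()); // true
        System.out.println(manualHash("Pune"));                 // 2499228
    }
}
```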
If new String() creates 2 objects (one in heap and one in String pool) then what is the use of .intern method ?
intern() method invoked on a String object looks for the string
contained by this String object in the pool, if the string is found
there then the string from the pool is returned. Otherwise, this
String object is added to the pool and a reference to this String
object is returned.
2 objects made out of:
String s = new String("xyz");
1st: a new String object is created in heap memory (outside the string pool)
2nd: the literal "xyz" is placed in the string constant pool
There is a concept called string pool in Java. A string pool (string intern pool) is a special storage area in the Java heap. When a string is created and if the string already exists in the pool, the reference of the existing string will be returned, instead of creating a new object and returning its reference.
So String s = new String("xyz") will create two objects.
The first object is created for the argument we are passing, the literal "xyz", and it is placed in the String literal pool (which lived in the permanent generation on older JVMs).
The second object is created in the Java heap memory as a result of the new operator.
Just because all your hash codes are the same does not mean that you are looking at the same object. Two objects are created. Let's break this down.
String s = new String("xyz");
In the part new String("xyz"), a new String object is created and its address is returned. String s = assigns that returned address to the variable s, so s refers to the new object; s itself is a reference, not a separate object. The second object is the literal "xyz" that is passed to the constructor.
I used the hashCode() method to find the number of string objects created.
The hashCode() method digests the content of the string into a single hash value, so equal hash codes do not by themselves prove that two references point to the same object.
CASE 1:
String s="Fred";
System.out.println(s.hashCode());
s=s+"47";
System.out.println(s.hashCode());
s=s.substring(2,5);
System.out.println(s.hashCode());
s=s.toUpperCase();
System.out.println(s.hashCode());
s=s.toString();
System.out.println(s.hashCode());
The output is:
Fred--2198155 //1st object ---------------- String s="Fred"
Fred47--2112428622 //2nd object ---------------- s=s+"47"
ed4--100213 //3rd object ---------------- s=s.substring(2,5)
ED4--68469 //4th object ---------------- s=s.toUpperCase()
ED4--68469 //no new object: toString() returns the same String -------- s=s.toString();
So 4 objects created in total.
CASE 2:
String s="FRED";
System.out.println(s.hashCode());
s=s+"47";
System.out.println(s.hashCode());
s=s.substring(2,5);
System.out.println(s.hashCode());
s=s.toUpperCase();
System.out.println(s.hashCode());
s=s.toString();
System.out.println(s.hashCode());
The output is:
FRED--2166379 //1st object ---------------- String s="FRED"
FRED47--2081891886 //2nd object ---------------- s=s+"47"
ED4--68469 //3rd object ---------------- s=s.substring(2,5)
ED4--68469 //same hash code, content is already upper case ------- s=s.toUpperCase()
ED4--68469 //no new object: toString() returns the same String -------- s=s.toString()
3 objects created in total.
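Because hashCode() depends only on content, a more reliable way to count distinct String objects is to compare identities. The sketch below does that with an identity-based set; the class name DistinctObjects and the helper countDistinct are hypothetical. It relies on two documented behaviours: concatenation with a non-constant operand creates a new String at run time, and String.toString() returns the object itself.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

public class DistinctObjects {
    /** Counts how many different objects the given references point to. */
    public static int countDistinct(Object... refs) {
        // A set that compares elements with == instead of equals().
        Set<Object> seen = Collections.newSetFromMap(new IdentityHashMap<Object, Boolean>());
        Collections.addAll(seen, refs);
        return seen.size();
    }

    public static void main(String[] args) {
        String a = "FRED";
        String b = a + "47";     // not a constant expression: a new object at run time
        String c = b.toString(); // toString() returns the same object
        System.out.println(countDistinct(a, b, c));               // 2
        System.out.println(countDistinct("xyz", new String("xyz"))); // 2: equal content, two objects
        System.out.println(countDistinct("xyz", "xyz"));          // 1: the same pooled literal
    }
}
```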
There is a way to find how many objects are created using the new keyword (String s1=new String("Rajesh")).
public class Rajesh {
public static void main(String[] args){
String s1=new String("Rajesh");
System.out.println(s1+s1.intern());
}
}
Output:
RajeshRajesh // s1 is "Rajesh" and s1.intern() is "Rajesh"
Note: As we know, the intern method always consults the string constant pool in heap memory.
String s = new String("xyz");
how many objects has been created in above code?
Only one object has been created in above code, that's in heap memory.
not two object.....
Suppose two objects were created, one in heap memory (by the new operator) and another in the String constant pool (for the string literal). If you then store the value below using a String literal,
String s1 = "xyz";
it does not return a reference to the object that s points to; s1 refers to a separate object in the String constant pool.
How?
We can check it by using the == operator (s == s1), which compares references.
If s referred to the object in the String constant pool, it would give true; in this case the output is false.
So the conclusion is one object is created in above code.

String object creation using new and its comparison with intern method

I read in the Kathy Sierra book that when we create a String using the new operator, like String s = new String("abc"), then because we used the new keyword, Java creates a new String object in normal (nonpool) memory, and s refers to it. In addition, the literal "abc" is placed in the pool.
The documentation of intern() says that if the String pool already contains an equal string, the string from the pool is returned. Otherwise, the String object is added to the pool and a reference to this String object is returned.
If creating "abc" with new also places the string in the pool, then why does intern() say that the string from the pool is returned if the pool contains it, and otherwise the String object is added to the pool?
Also, I want to know: if we create a String using new, how many objects actually get created?
TL;DR: If you ever really need to do new String("abc"), you'll know you need to and you'll know why. It's so rare that it's almost valid to say you never need to. Just use "abc".
The long version:
When you have the code new String("abc") the following things occur at various times:
When the class containing that code is loaded, if a string with the characters "abc" is not already in the intern pool, it's created and put there.
When the new String("abc") code is run:
A reference to the "abc" string from the intern pool is passed into the String constructor.
A new String object is created and initialized by copying the characters from the String passed into the constructor.
The new String object is returned to you.
If string "abc" when created using new also placed the string in the pool, then why does intern() says that string from the pool is returned if String pool contains the string otherwise the string object is added to the pool.
Because that's what intern does. Note that calling intern on a string literal is a no-op; string literals are all interned automatically. E.g.:
String s1 = "abc"; // Get a reference to the string defined by the literal
String s2 = s1.intern(); // No-op
System.out.println(s1 == s2); // "true"
System.out.println(s1 == "abc"); // "true", all literals are interned automatically
Also I want to know if we create a String using new then actually how many objects get created?
You create at least one String object (the new, non-interned one), and possibly two (if the literal wasn't already in the pool; but again, that bit happens earlier, when the class file's literals are loaded):
String s1 = "abc"; // Get a reference to the string defined by the literal
String s2 = new String(s1); // Create a new `String` object (guaranteed)
System.out.println(s1 == s2); // "false"
String s3 = s2.intern(); // Get the interned version of the string with these characters
System.out.println(s1 == s3); // "true"
The String pool is a pool of String references. Objects are created on the heap only.
When using new String("abc").intern(), or a literal as in String s = "abc";, the String pool is checked for an existing reference to "abc".
If a reference for "abc" already exists in the pool and .intern() is called on a reference to a String object created with new String("abc"), then the object created by new String("abc") becomes eligible for garbage collection once nothing else refers to it. See the code below for more clarity.
public static void main(String[] args) {
String s = new String("abc");
String a = s;
System.out.println(s==a);// true
String b = "abc";
s = s.intern();
System.out.println(s==a);// false
}
