One thing I have always wondered: if I have a method like this:
String replaceStuff (String plainText) {
return plainText.replaceAll("&", "&amp;");
}
will it create new String objects for the "&" and the "&amp;" on every call, which then get destroyed by the GC and recreated on the next call?
E.g., would it in theory be better to do something like this:
final String A = "&";
final String AMP = "&amp;";
String replaceStuff (String plainText) {
return plainText.replaceAll(A, AMP);
}
I think this is probably more of a theoretical question than a real-life problem; I am just curious how memory management handles this case.
No. String literals are interned. Even if you use an equal literal (or other constant) from elsewhere, you'll still refer to the same object:
Object x = "hello";
Object y = "he" + "llo";
System.out.println(x == y); // Guaranteed to print true.
EDIT: The JLS guarantees this in section 3.10.5
String literals - or, more generally, strings that are the values of constant expressions (§15.28) - are "interned" so as to share unique instances, using the method String.intern.
Section 15.28 shows the + operator being included as an operation which can produce a new constant from two other constants.
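A minimal sketch for checking this on your own JVM. The class and method names (LiteralReuseDemo, literal()) are made up for illustration; only the identity comparisons matter.

```java
class LiteralReuseDemo {
    // Hypothetical constant, mirroring the AMP field from the question.
    static final String AMP = "&amp;";

    // Returns a literal; every call hands back the same interned object.
    static String literal() {
        return "&amp;";
    }

    // Two separate calls yield the very same object, so nothing is re-created.
    static boolean sameInstanceAcrossCalls() {
        return literal() == literal();
    }

    // The literal in the method and the constant field share one object too.
    static boolean sameInstanceAsConstant() {
        return literal() == AMP;
    }

    // Only an explicit "new String(...)" produces a distinct object.
    static boolean explicitCopyIsDistinct() {
        return new String("&amp;") != literal();
    }

    public static void main(String[] args) {
        System.out.println(sameInstanceAcrossCalls());
        System.out.println(sameInstanceAsConstant());
        System.out.println(explicitCopyIsDistinct());
    }
}
```

So hoisting the literals into final fields should not change how many String objects exist; it is purely a readability choice.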
Nope, they're literals and therefore automatically interned to the constant pool.
The only way you'd create new strings each time would be to do:
String replaceStuff (String plainText) {
return plainText.replaceAll(new String("&"), new String("&amp;"));
}
Strings are handled a little differently from normal objects by the GC.
For example, given
String a = "aaa";
String a1 = "aaa";
both a and a1 point to the same String object in memory until one of the variables is reassigned. Hence there is only one object in memory.
Also, if we change a and a1 to point to other strings, the value "aaa" is still left in the string pool and will be reused later by the JVM if required. The string is not GC'd while the class containing the literal remains loaded.
Java string pool coupled with reflection can produce some unimaginable result in Java:
import java.lang.reflect.Field;
class MessingWithString {
public static void main (String[] args) {
String str = "Mario";
toLuigi(str);
System.out.println(str + " " + "Mario");
}
public static void toLuigi(String original) {
try {
Field stringValue = String.class.getDeclaredField("value");
stringValue.setAccessible(true);
stringValue.set(original, "Luigi".toCharArray());
} catch (Exception ex) {
// Ignore exceptions
}
}
}
Above code will print:
"Luigi Luigi"
What happened to Mario?
You changed it, basically. Yes, with reflection you can violate the immutability of strings... and due to string interning, that means any use of "Mario" (other than in a larger string constant expression, which would have been resolved at compile-time) will end up as "Luigi" in the rest of the program.
This kind of thing is why reflection requires security permissions...
Note that the expression str + " " + "Mario" does not perform any compile-time concatenation, due to the left-associativity of +. It's effectively (str + " ") + "Mario", which is why you still see Luigi Luigi. If you change the code to:
System.out.println(str + (" " + "Mario"));
... then you'll see Luigi Mario as the compiler will have interned " Mario" to a different string to "Mario".
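The compile-time folding described above can be observed without reflection. The sketch below (class and method names are made up) compares object identity: an expression built only from literals or from a final constant variable is folded by the compiler and interned, while one involving a non-final variable is concatenated at run time into a fresh object.

```java
class ConstantFoldingDemo {
    // str is an ordinary local variable, so the concatenation happens at run time.
    static boolean nonFinalOperandFolded() {
        String str = "Mario";
        String joined = str + " Mario";
        return joined == "Mario Mario";
    }

    // A final variable initialized with a literal is a constant variable,
    // so the whole expression is a compile-time constant.
    static boolean finalOperandFolded() {
        final String str = "Mario";
        String joined = str + " Mario";
        return joined == "Mario Mario";
    }

    // Two literals joined with + are folded into the single constant " Mario".
    static boolean literalsFolded() {
        return (" " + "Mario") == " Mario";
    }

    public static void main(String[] args) {
        System.out.println(nonFinalOperandFolded()); // false
        System.out.println(finalOperandFolded());    // true
        System.out.println(literalsFolded());        // true
    }
}
```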
It was set to Luigi. Strings in Java are immutable; thus, the compiler can interpret all mentions of "Mario" as references to the same String constant pool item (roughly, "memory location"). You used reflection to change that item; so all "Mario" in your code are now as if you wrote "Luigi".
To explain the existing answers a bit more, let's take a look at your generated byte code (Only the main() method here).
Now, any changes to the contents of that location will affect both references (and any others you create, too).
String literals are stored in the string pool and their canonical value is used. Both "Mario" literals aren't just strings with the same value, they are the same object. Manipulating one of them (using reflection) will modify "both" of them, as they are just two references to the same object.
You just changed the String in the String constant pool from Mario to Luigi. It was referenced by multiple Strings, so every literal Mario that refers to it is now Luigi.
Field stringValue = String.class.getDeclaredField("value");
This fetches the char[] field named value from the String class.
stringValue.setAccessible(true);
This makes it accessible.
stringValue.set(original, "Luigi".toCharArray());
This changes the value field of original to Luigi. But original is the String literal Mario, and literals belong to the String pool and are all interned, which means that all literals with the same content refer to the same memory address.
String a = "Mario"; // created in the String pool
String b = "Mario"; // refers to the same Mario in the String pool
a == b // true
// You changed 'a' to Luigi, and 'b' doesn't know that
// 'a' has been changed internally;
// 'b' still refers to the same address.
Basically, you have changed the Mario in the String pool, and that change is reflected in every field that references it. If you create a String object (i.e. new String("Mario")) instead of using a literal, you will not see this behavior, because then you will have two different Marios.
The other answers adequately explain what's going on. I just wanted to add the point that this only works if there is no security manager installed. When running code from the command line there is none by default, and you can do things like this. However, in an environment where trusted code is mixed with untrusted code, such as an application server in a production environment or an applet sandbox in a browser, there would typically be a security manager present and you would not be allowed these kinds of shenanigans, so this is less of a terrible security hole than it seems.
Another related point: you can make use of the constant pool to improve the performance of string comparisons in some circumstances, by using the String.intern() method.
That method returns the instance of String from the String constant pool that has the same contents as the String on which it is invoked, adding it if it is not yet present. In other words, after using intern(), all Strings with the same contents are guaranteed to be the same String instance as each other and as any String constants with those contents, meaning you can then use the equality operator (==) on them.
This is just an example which is not very useful on its own, but it illustrates the point:
class Key {
Key(String keyComponent) {
this.keyComponent = keyComponent.intern();
}
public boolean equals(Object o) {
// String comparison using the equals operator allowed due to the
// intern() in the constructor, which guarantees that all values
// of keyComponent with the same content will refer to the same
// instance of String:
return (o instanceof Key) && (keyComponent == ((Key) o).keyComponent);
}
public int hashCode() {
return keyComponent.hashCode();
}
boolean isSpecialCase() {
// String comparison using equals operator valid due to use of
// intern() in constructor, which guarantees that any keyComponent
// with the same contents as the SPECIAL_CASE constant will
// refer to the same instance of String:
return keyComponent == SPECIAL_CASE;
}
private final String keyComponent;
private static final String SPECIAL_CASE = "SpecialCase";
}
This little trick isn't worth designing your code around, but it is worth keeping in mind for the day when you notice a little more speed could be eked out of some bit of performance sensitive code by using the == operator on a string with judicious use of intern().
I know there are two ways of creating String in Java:
String a = "aaa";
String b = new String("bbb");
With the first way, Java will definitely create a String object in the string pool and make a refer to it. (Assume "aaa" wasn't in the pool before.)
With the second method, an object will be created in the heap, but will jvm also create an object in the string pool?
In the post "Questions about Java's String pool", @Jesper said:
If you do this:
String s = new String("abc");
then there will be one String object in the pool, the one that represents the literal "abc", and there will be a separate String object, not in the pool, that contains a copy of the content of the pooled object.
If that's true, then every time with the new String("bbb");, an object "bbb" is created in the pool, which means that either way, Java will always create a string object in the pool. Then what is intern() used for? In the docs http://docs.oracle.com/javase/7/docs/api/java/lang/String.html#intern(), it says:
When the intern method is invoked, if the pool already contains a string equal to this String object as determined by the equals(Object) method, then the string from the pool is returned. Otherwise, this String object is added to the pool and a reference to this String object is returned.
That means there are cases that a string is not in the pool, is that possible ? Which one is true ?
As you know, String is an immutable object in the Java programming language, which means that once constructed it cannot be altered. Because of this, the JVM can maintain a literal pool, which helps reduce memory usage and increase performance. Each time a String literal is used, the JVM checks the literal pool. If the literal is already available, the same reference is returned. If the literal is not available, a new String object is created and added to the literal pool.
This theory is applied when you try to create a String like a primitive or a literal/constant.
String str = "bbb";
But when you create a new String object
String str = new String("bbb");
the above mentioned rules are overridden and a new instance is created always.
But the intern API in the String class can be used to pick the String reference from the literal pool even when you create a String using the new operator. Please check the example below. Although str3 is created using the new operator, the JVM picked up the reference from the literal pool because we used the intern method.
public class StringInternExample {
public static void main(final String args[]) {
final String str = "bbb";
final String str1 = "bbb";
final String str2 = new String("bbb");
final String str3 = new String("bbb").intern();
System.out.println("str == str1 : "+(str == str1));
System.out.println("str == str2 : "+(str == str2));
System.out.println("str == str3 : "+(str == str3));
}
}
Output of above code:
str == str1 : true
str == str2 : false
str == str3 : true
You can have a look: Confusion on string immutability
Source of answer: http://ourownjava.com/java/java-string-immutability-and-intern-method/
Shishir
There are essentially two ways that String objects can enter the pool:
Using a literal in source code like "bbb".
Using intern.
intern is for when you have a String that's not otherwise from the pool. For example:
String bb = "bbb".substring(1); // substring creates a new object
System.out.println(bb == "bb"); // false
System.out.println(bb.intern() == "bb"); // true
Or slightly different:
System.out.println(new String("bbb").intern() == "bbb"); // true
new String("bbb") does create two objects...
String fromLiteral = "bbb"; // in pool
String fromNewString = new String(fromLiteral); // not in pool
...but it's more like a special case. It creates two objects because "bbb" refers to an object:
A string literal is a reference to an instance of class String [...].
Moreover, a string literal always refers to the same instance of class String.
And new String(...) creates a copy of it.
However, there are many ways String objects are created without using a literal, such as:
All the String methods that produce a new, modified string. (substring, split, replace, etc.)
Reading a String from some kind of input such as a Scanner or Reader.
Concatenation when at least one operand is not a compile-time constant.
intern lets you add them to the pool or retrieve an existing object if there was one. Under most circumstances interning Strings is unnecessary but it can be used as an optimization because:
It lets you compare with ==.
It can save memory because duplicates can be garbage collected.
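A small sketch of the memory point, under the assumption that input arrives as freshly allocated Strings (simulated here with new String). The class and helper names are hypothetical. It counts distinct objects by identity before and after interning.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

class InternDedupDemo {
    // Counts distinct String objects by identity, not by content.
    static int distinctInstances(List<String> strings) {
        Set<String> seen =
                Collections.newSetFromMap(new IdentityHashMap<String, Boolean>());
        seen.addAll(strings);
        return seen.size();
    }

    // Simulates reading input: every token is a newly allocated String.
    static List<String> readTokens() {
        List<String> tokens = new ArrayList<>();
        for (int i = 0; i < 6; i++) {
            tokens.add(new String(i % 2 == 0 ? "name" : "value"));
        }
        return tokens;
    }

    // Replaces each token with its pooled representative.
    static List<String> interned(List<String> strings) {
        List<String> out = new ArrayList<>(strings.size());
        for (String s : strings) {
            out.add(s.intern());
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> raw = readTokens();
        System.out.println(distinctInstances(raw));           // 6
        System.out.println(distinctInstances(interned(raw))); // 2
    }
}
```

Once the six originals are no longer referenced, only the two pooled objects need to stay alive.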
Yes, new String("abc") will create a new object in memory, and thus it is advised to avoid it. Please have a look at item 5 of Josh Bloch's Effective Java, "Avoid creating unnecessary objects" where it is better explained:
As an extreme example of what not to do, consider this statement:
String s = new String("stringette"); // DON'T DO THIS!
The statement creates a new String instance each time it is executed, and none of those object creations is necessary. The argument to the String constructor ("stringette") is itself a String instance, functionally identical to all of the objects created by the constructor. If this usage occurs in a loop or in a frequently invoked method, millions of String instances can be created needlessly. The improved version is simply the following:
String s = "stringette";
This version uses a single String instance, rather than creating a new one each time it is executed. Furthermore, it is guaranteed that the object will be reused by any other code running in the same virtual machine that happens to contain the same string literal [JLS, 3.10.5].
http://uet.vnu.edu.vn/~chauttm/e-books/java/Effective.Java.2nd.Edition.May.2008.3000th.Release.pdf
With the second method, an object will be created in the heap, but will jvm also create an object in the string pool?
Yes, but it is the string literal "bbb" which ensures the interned string [1]. The string constructor creates a new string object which is a copy with the same length and content - the newly created string is not automatically interned.
If that's true, then every time with the new String("bbb");, an object "bbb" is created in the pool, which means that either way, Java will always create a string object in the pool. Then what is intern() used for?
Only string literals are automatically interned. Other string objects must be manually interned, if such is the desired behavior.
That means there are cases that a string is not in the pool, is that possible ?
With the exception of manual calls to String.intern, only string literals result in interned strings.
While I would recommend using a specialized collection for such cases, interning may be useful where it can be used to avoid creating extra duplicate objects. Some use cases where interning can be beneficial - as in, the same string value can appear many times - are JSON keys and XML element/attribute names.
[1] This is trivial to reason about; consider:
String _b = "bbb"; // string from string literal (this is interned)
String b = new String(_b); // create a NEW string via "copy constructor"
System.out.println(b == _b); // false (new did NOT return an interned string)
System.out.println(b.equals(_b)); // true (but it did return an equivalent string)
System.out.println(b.intern() == _b); // true (which interns to the same string object)
I recall seeing a couple of string-intensive programs that do a lot of string comparison but relatively little string manipulation, and that have used a separate table to map strings to identifiers for efficient equality and lower memory footprint, e.g.:
public class Name {
public static Map<String, Name> names = new SomeMap<String, Name>();
public static Name from(String s) {
Name n = names.get(s);
if (n == null) {
n = new Name(s);
names.put(s, n);
}
return n;
}
private final String str;
private Name(String str) { this.str = str; }
@Override public String toString() { return str; }
// equals() and hashCode() are not overridden!
}
I'm pretty sure one of these programs was javac from OpenJDK, so not some toy application. Of course the actual class was more complex (and also I think it implemented CharSequence), but you get the idea - the entire program was littered with Name in any location you would expect String, and on the rare cases where string manipulation was needed, it converted to strings and then cached them again, conceptually like:
Name newName = Name.from(name.toString().substring(5));
I think I understand the point of this - especially when there are a lot of identical strings all around and a lot of comparisons - but couldn't the same be achieved by just using regular strings and interning them? The documentation for String.intern() explicitly says:
...
When the intern method is invoked, if the pool already contains a string equal to this String object as determined by the equals(Object) method, then the string from the pool is returned. Otherwise, this String object is added to the pool and a reference to this String object is returned.
It follows that for any two strings s and t, s.intern() == t.intern() is true if and only if s.equals(t) is true.
...
So, what are the advantages and disadvantages of manually managing a Name-like class vs using intern()?
What I've thought about so far was:
Manually managing the map means using regular heap, intern() uses the permgen.
When manually managing the map you enjoy type-checking that can verify something is a Name, while an interned string and a non-interned string share the same type so it's possible to forget interning in some places.
Relying on intern() means reusing an existing, optimized, tried-and-tested mechanism without coding any extra classes.
Manually managing the map results in code that is more confusing to new users, and string operations become more cumbersome.
... but I feel like I'm missing something else here.
Unfortunately, String.intern() can be slower than a simple synchronized HashMap. It doesn't need to be so slow, but as of today in Oracle's JDK, it is slow (probably due to JNI)
Another thing to consider: you are writing a parser; you collected some chars in a char[], and you need to make a String out of them. Since the string is probably common and can be shared, we'd like to use a pool.
String.intern() uses such a pool; yet to look up, you'll need a String to begin with. So we need to new String(char[],offset,length) first.
We can avoid that overhead in a custom pool, where lookup can be done directly based on a char[],offset,length. For example, the pool is a trie. The string most likely is in the pool, so we'll get the String without any memory allocation.
If we don't want to write our own pool, but use the good old HashMap, we'll still need to create a key object that wraps char[],offset,length (something like CharSequence). This is still cheaper than a new String, since we don't copy chars.
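A rough sketch of the HashMap variant just described. Everything here (CharSlicePool, Slice, get) is a hypothetical name, not an existing API. The lookup key is a small view over the caller's char[], so a hit costs one tiny wrapper object and no character copying; a String is built only on a miss.

```java
import java.util.HashMap;
import java.util.Map;

class CharSlicePool {
    // Hypothetical key type: a view over part of a char[]; nothing is copied.
    static final class Slice {
        final char[] buf;
        final int off;
        final int len;

        Slice(char[] buf, int off, int len) {
            this.buf = buf;
            this.off = off;
            this.len = len;
        }

        @Override
        public int hashCode() {
            int h = 0;
            for (int i = 0; i < len; i++) {
                h = 31 * h + buf[off + i];
            }
            return h;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Slice)) return false;
            Slice other = (Slice) o;
            if (other.len != len) return false;
            for (int i = 0; i < len; i++) {
                if (buf[off + i] != other.buf[other.off + i]) return false;
            }
            return true;
        }
    }

    private final Map<Slice, String> pool = new HashMap<>();

    // Returns the pooled String for buf[off, off + len), creating it on a miss.
    String get(char[] buf, int off, int len) {
        String s = pool.get(new Slice(buf, off, len));
        if (s == null) {
            s = new String(buf, off, len);
            // The stored key owns its own chars, since the caller may reuse buf.
            pool.put(new Slice(s.toCharArray(), 0, len), s);
        }
        return s;
    }

    public static void main(String[] args) {
        CharSlicePool p = new CharSlicePool();
        char[] buf = "foo bar foo".toCharArray();
        System.out.println(p.get(buf, 0, 3) == p.get(buf, 8, 3)); // true
    }
}
```

A trie keyed on the characters could avoid even the wrapper allocation, at the cost of more code.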
I would always go with the Map because intern() has to do a (probably linear) search inside the internal String's pool of strings. If you do that quite often it is not as efficient as Map - Map is made for fast search.
what are the advantages and disadvantages of manually managing a Name-like class vs using intern()
Type checking is a major concern, but invariant preservation is also a significant concern.
Adding a simple check to the Name constructor
Name(String s) {
if (!isValidName(s)) { throw new IllegalArgumentException(s); }
...
}
can ensure* that there exist no Name instances corresponding to invalid names like "12#blue,," which means that methods that take Names as arguments and that consume Names returned by other methods don't need to worry about where invalid Names might creep in.
To generalize this argument, imagine your code is a castle with walls designed to protect it from invalid inputs. You want some inputs to get through so you install gates with guards that check inputs as they come through. The Name constructor is an example of a guard.
The difference between String and Name is that Strings can't be guarded against. Any piece of code, malicious or naive, inside or outside the perimeter, can create any string value. Buggy String manipulation code is analogous to a zombie outbreak inside the castle. The guards can't protect the invariants because the zombies don't need to get past them. The zombies just spread and corrupt data as they go.
That a value "is a" String satisfies fewer useful invariants than that a value "is a" Name.
See stringly typed for another way to look at the same topic.
* - usual caveat re deserializing of Serializable allowing bypass of constructor.
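To make the guard idea concrete, here is a small sketch that combines the canonicalizing map from the question with a validating constructor. The class name ValidatedName and the particular validity rule (Java-identifier-like names) are assumptions for illustration only; the original isValidName is not shown in the answer.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class ValidatedName {
    private static final Map<String, ValidatedName> NAMES = new ConcurrentHashMap<>();

    private final String str;

    // The constructor is the guard: no instance can exist for an invalid name.
    private ValidatedName(String s) {
        if (!isValidName(s)) {
            throw new IllegalArgumentException(s);
        }
        this.str = s;
    }

    // Assumed rule for this sketch: non-empty and shaped like a Java identifier.
    static boolean isValidName(String s) {
        if (s == null || s.isEmpty() || !Character.isJavaIdentifierStart(s.charAt(0))) {
            return false;
        }
        for (int i = 1; i < s.length(); i++) {
            if (!Character.isJavaIdentifierPart(s.charAt(i))) return false;
        }
        return true;
    }

    // Returns the single canonical instance for s, creating it if needed.
    static ValidatedName from(String s) {
        if (s == null) {
            throw new IllegalArgumentException("null");
        }
        return NAMES.computeIfAbsent(s, ValidatedName::new);
    }

    @Override
    public String toString() {
        return str;
    }

    public static void main(String[] args) {
        System.out.println(from("blue") == from("blue")); // true
    }
}
```

Code that receives a ValidatedName can rely on the invariant without re-checking it, which an interned String cannot offer.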
String.intern() in Java 5.0 & 6 uses the perm gen space which usually has a low maximum size. It can mean you run out of space even though there is plenty of free heap.
Java 7 uses the regular heap to store intern()ed Strings.
String comparison is pretty fast, and I don't imagine there is much advantage in cutting comparison times when you consider the overhead.
Another reason this might be done is if there are many duplicate strings. If there is enough duplication, this can save a lot of memory.
A simpler way to cache Strings is to use an LRU cache like LinkedHashMap
private static final int MAX_SIZE = 10000;
private static final Map<String, String> STRING_CACHE = new LinkedHashMap<String, String>(MAX_SIZE*10/7, 0.70f, true) {
@Override
protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
return size() > MAX_SIZE;
}
};
public static String intern(String s) {
// s2 is a String equal to s, or null if it's not there.
String s2 = STRING_CACHE.get(s);
if (s2 == null) {
// put the string in the map if it's not there already.
s2 = s;
STRING_CACHE.put(s2,s2);
}
return s2;
}
Here is an example of how it works.
public static void main(String... args) {
String lo = "lo";
for (int i = 0; i < 10; i++) {
String a = "hel" + lo + " " + (i & 1);
String b = intern(a);
System.out.println("String \"" + a + "\" has an id of "
+ Integer.toHexString(System.identityHashCode(a))
+ " after interning is has an id of "
+ Integer.toHexString(System.identityHashCode(b))
);
}
System.out.println("The cache contains "+STRING_CACHE);
}
prints
String "hello 0" has an id of 237360be after interning is has an id of 237360be
String "hello 1" has an id of 5736ab79 after interning is has an id of 5736ab79
String "hello 0" has an id of 38b72ce1 after interning is has an id of 237360be
String "hello 1" has an id of 64a06824 after interning is has an id of 5736ab79
String "hello 0" has an id of 115d533d after interning is has an id of 237360be
String "hello 1" has an id of 603d2b3 after interning is has an id of 5736ab79
String "hello 0" has an id of 64fde8da after interning is has an id of 237360be
String "hello 1" has an id of 59c27402 after interning is has an id of 5736ab79
String "hello 0" has an id of 6d4e5d57 after interning is has an id of 237360be
String "hello 1" has an id of 2a36bb87 after interning is has an id of 5736ab79
The cache contains {hello 0=hello 0, hello 1=hello 1}
This ensures the cache of intern()ed Strings is limited in number.
A faster but less effective way is to use a fixed array.
private static final int MAX_SIZE = 10191;
private static final String[] STRING_CACHE = new String[MAX_SIZE];
public static String intern(String s) {
int hash = (s.hashCode() & 0x7FFFFFFF) % MAX_SIZE;
String s2 = STRING_CACHE[hash];
if (!s.equals(s2))
STRING_CACHE[hash] = s2 = s;
return s2;
}
The test above works the same except you need
System.out.println("The cache contains "+ new HashSet<String>(Arrays.asList(STRING_CACHE)));
to print out the contents, which shows the following, including one null for the empty entries.
The cache contains [null, hello 1, hello 0]
The advantage of this approach is speed, and that it can be safely used by multiple threads without locking; i.e. it doesn't matter if different threads have different views of STRING_CACHE.
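Why the fixed array is "less effective" can be seen with two strings that land in the same slot. The sketch below repeats the array cache in a self-contained class (ArrayCacheCollisionDemo is a made-up name) and relies on the fact that "Aa" and "BB" have the same hashCode, so each one evicts the other.

```java
class ArrayCacheCollisionDemo {
    private static final int MAX_SIZE = 10191;
    private static final String[] STRING_CACHE = new String[MAX_SIZE];

    // Same logic as the array-based intern above: one candidate per slot.
    static String intern(String s) {
        int slot = (s.hashCode() & 0x7FFFFFFF) % MAX_SIZE;
        String cached = STRING_CACHE[slot];
        if (!s.equals(cached)) {
            STRING_CACHE[slot] = cached = s;
        }
        return cached;
    }

    // Without a collision, an equal string gets the cached instance back.
    static boolean hitReturnsCachedInstance() {
        String first = intern(new String("hello"));
        return intern(new String("hello")) == first;
    }

    // "Aa" and "BB" share a hash code, so caching "BB" evicts "Aa".
    static boolean collisionEvicts() {
        String aa = intern(new String("Aa"));
        intern("BB");
        return intern(new String("Aa")) != aa;
    }

    public static void main(String[] args) {
        System.out.println(hitReturnsCachedInstance()); // true
        System.out.println(collisionEvicts());          // true
    }
}
```

The cache stays correct for equals(), but callers cannot rely on == between values obtained at different times.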
So, what are the advantages and disadvantages of manually managing a
Name-like class vs using intern()?
One advantage is:
It follows that for any two strings s and t, s.intern() == t.intern()
is true if and only if s.equals(t) is true.
In a program where many many small strings must be compared often, this may pay off.
Also, it saves space in the end. Consider a source program that uses names like AbstractSyntaxTreeNodeItemFactorySerializer quite often. With intern(), this string will be stored once and that is it. Everything else is just references to it, and you have the references anyway.
I was working on a basic Java program and found a very funny thing, which I am sharing with you. foo() gives the output (s==s1) = false and bar() gives (s==s1) = true.
I want to know why this happens.
public class StringTest
{
public static void main(String[] args){
foo();
bar();
}
public static void foo(){
String s = "str4";
String s1 = "str" + s.length();
System.out.println("(s==s1) = " + (s1==s));
}
public static void bar(){
String s = "str4";
String s1 = "str" + "4";
System.out.println("(s==s1) = " + (s1==s));
}
}
In the latter case, the compiler optimizes the string concatenation. As this can be done at compile time, both reference the same constant string object.
In the former case, the length() call can't be optimized during compile time. At runtime, a new string object is created, which is not identical to the string constant (but equal to it)
The string catenation in bar() can be done at compile time, because it's an expression composed of nothing but compile-time constants. Although the length of the String s is obviously known at compile time, the compiler doesn't know that length() returns that known value, so it won't be used as a constant.
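A short sketch that separates the two effects (the class name LengthConcatDemo is made up). Concatenating with an int literal is still a constant expression and gets folded, so the operand's type is not what matters; the method call is what prevents folding.

```java
class LengthConcatDemo {
    // "str" + 4 uses only literals, so the compiler folds it to "str4".
    static boolean intLiteralIsFolded() {
        return ("str" + 4) == "str4";
    }

    // s.length() is a method call, so the concatenation happens at run time.
    static boolean methodCallIsFolded() {
        String s = "str4";
        return ("str" + s.length()) == s;
    }

    // intern() maps the run-time result back to the pooled literal.
    static boolean internRestoresIdentity() {
        String s = "str4";
        return ("str" + s.length()).intern() == s;
    }

    public static void main(String[] args) {
        System.out.println(intLiteralIsFolded());     // true
        System.out.println(methodCallIsFolded());     // false
        System.out.println(internRestoresIdentity()); // true
    }
}
```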
When you write a line of code like this:
String s1 = "str" + "4";
then the compiler is smart enough to optimize this to:
String s1 = "str4";
Literal strings in Java are managed in a string pool. When you have two literal strings that have the same content (such as s and s1 in your second example), then just one String object will be created which will be shared by the two variables.
The == operator in Java checks if two variables refer to the same object. Since there is only one String object in the second example, s == s1 will be true.
String s1 = "str" + s.length();
String s1 = "str" + "4";
In the first case, s.length() returns a value of type int; in the second case, the type is String.
Even though the value is 4 in both cases, the types are not the same :)
It probably has to do with the fact that foo() is creating a new String instance from s.length() (as if by toString()), whereas bar() is just concatenating constants. I don't know the nitty-gritty of it, but my gut tells me it is in that direction.
If I needed to guess I would say that the java compiler performs some optimization onto bar(). At compiletime it is clear that "str" + "4" can be replaced by "str4" which (since Strings are immutable objects) is indeed the very same object as "str4"-String used for the s-initialization.
Within foo() the optimization is not that straightforward. In general, the value of the s1 variable cannot be predicted very easily (although this example is quite straightforward). So the Java compiler will produce two different objects for s and s1.
The "==" operator does not compare the value of the Strings! It checks whether these are the same Objects. To compare the values of the Strings use the "equals" method like this:
String s = "str4";
String s1 = "str" + s.length();
System.out.println("s1.equals(s) = " + s1.equals(s));
You should try playing with the intern method of the String class. Java keeps something like a dictionary where all distinct strings are stored. When you create a string object that can be evaluated at compile time, Java searches for it in its dictionary. If it finds the string, it stores only a reference to this string (which is what the intern method returns).
You should notice that:
"str4" == ("str" + "str4".length()) returns false, but
"str4" == ("str" + "str4".length()).intern() returns true, because only the "wrapper" object is different; intern() returns the pooled instance with the same contents.
I am confused about StringPool in Java. I came across this while reading the String chapter in Java. Please help me understand, in layman terms, what StringPool actually does.
This prints true (even though we don't use equals method: correct way to compare strings)
String s = "a" + "bc";
String t = "ab" + "c";
System.out.println(s == t);
When the compiler optimizes your string literals, it sees that both s and t have the same value, so you need only one string object. It's safe because String is immutable in Java.
As a result, both s and t point to the same object and a little memory is saved.
The name 'string pool' comes from the idea that all already-defined strings are stored in some 'pool', and before creating a new String object the compiler checks whether such a string is already defined.
I don't think it actually does much; it looks like it's just a cache for string literals. If you have multiple Strings whose values are the same, they'll all point to the same string literal in the string pool.
String s1 = "Arul"; //case 1
String s2 = "Arul"; //case 2
In case 1, the literal for s1 is newly created and kept in the pool. In case 2, s2 refers to the same literal as s1; no new one is created.
if(s1 == s2) System.out.println("equal"); //Prints equal.
String n1 = new String("Arul");
String n2 = new String("Arul");
if(n1 == n2) System.out.println("equal"); //No output.
http://p2p.wrox.com/java-espanol/29312-string-pooling.html
Let's start with a quote from the virtual machine spec:
Loading of a class or interface that contains a String literal may create a new String object (§2.4.8) to represent that literal. This may not occur if a String object has already been created to represent a previous occurrence of that literal, or if the String.intern method has been invoked on a String object representing the same string as the literal.
This may not occur - This is a hint, that there's something special about String objects. Usually, invoking a constructor will always create a new instance of the class. This is not the case with Strings, especially when String objects are 'created' with literals. Those Strings are stored in a global store (pool) - or at least the references are kept in a pool, and whenever a new instance of an already known Strings is needed, the vm returns a reference to the object from the pool. In pseudo code, it may go like that:
1: a := "one"
   --> if (pool[hash("one")] == null)   // true: "one" is not in the pool yet
           pool[hash("one")] := "one"
       return pool[hash("one")]
2: b := "one"
   --> if (pool[hash("one")] == null)   // false: "one" is already in the pool
           pool[hash("one")] := "one"   // skipped
       return pool[hash("one")]
So in this case, variables a and b hold references to the same object, and we have (a == b) && (a.equals(b)) == true.
This is not the case if we use the constructor:
1: a := "one"
2: b := new String("one")
Again, "one" is created in the pool, but then we create a new instance from the same literal, and in this case it leads to (a == b) && (a.equals(b)) == false.
So why do we have a String pool? Strings, and especially String literals, are widely used in typical Java code, and they are immutable. Being immutable allows Strings to be cached to save memory and increase performance (less effort for creation, less garbage to be collected).
As programmers we don't have to care much about the String pool, as long as we keep in mind:
(a == b) && (a.equals(b)) may be true or false (always use equals to compare Strings)
Don't use reflection to change the backing char[] of a String (as you don't know who is actually using that String)
When the JVM loads classes, or otherwise sees a literal string, or some code interns a string, it adds the string to a mostly-hidden lookup table that has one copy of each such string. If another copy is added, the runtime arranges it so that all the literals refer to the same string object. This is called "interning". If you say something like
String s = "test";
return (s == "test");
it'll return true, because the first and second "test" are actually the same object. Comparing interned strings this way can be much, much faster than String.equals, as there's a single reference comparison rather than a bunch of char comparisons.
You can add a string to the pool by calling String.intern(), which will give you back the pooled version of the string (which could be the same string you're interning, but you'd be crazy to rely on that -- you often can't be sure exactly what code has been loaded and run up until now and interned the same string). The pooled version (the string returned from intern) will be identical to any literal with the same contents. For example:
String s1 = "test";
String s2 = new String("test"); // "new String" guarantees a different object
System.out.println(s1 == s2); // should print "false"
s2 = s2.intern();
System.out.println(s1 == s2); // should print "true"