String Creation in Java (Memory Usage)

In Java, is there a difference between the following two pieces of code? I'm looking for answers in terms of memory usage and the String pool.
The first:
String s = new String();
s = "abcdef";
The second:
String s = new String("abcdef");
Thanks.

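In the first snippet you do a creation followed by an assignment: new String() allocates an empty String, and s is then pointed at the literal "abcdef", which leaves the empty object as garbage. In the second snippet you do a single creation, but new String("abcdef") always allocates a fresh String on the heap in addition to the pooled literal. So the first does slightly more work, while the second leaves s referring to a heap copy rather than to the pooled "abcdef".
For an explanation of the String pool, see:
What is the Java string pool and how is "s" different from new String("s")?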
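To make the comparison concrete, here is a minimal sketch that checks which object s ends up referring to in each snippet. The class and method names (CreationDemo, firstForm, secondForm) are made up for illustration; only new String(...), == and equals() come from the language and the JDK.

```java
class CreationDemo {
    // First form: the empty String from new String() is discarded,
    // and s ends up referring to the pooled literal.
    static String firstForm() {
        String s = new String();
        s = "abcdef";
        return s;
    }

    // Second form: s refers to a fresh heap object whose contents
    // equal the pooled literal.
    static String secondForm() {
        String s = new String("abcdef");
        return s;
    }

    public static void main(String[] args) {
        String pooled = "abcdef";
        System.out.println(firstForm() == pooled);       // true: same object as the literal
        System.out.println(secondForm() == pooled);      // false: a distinct object
        System.out.println(secondForm().equals(pooled)); // true: same contents
    }
}
```

The == comparisons are used here only to inspect object identity; ordinary code should compare strings with equals().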

Related

How many Strings are formed? [duplicate]

String a = "hello";
String b = a + "bye";
How many Strings are formed?
From my understanding of Java, what happens in this code is:
String a = "hello"; // hello is created in string pool
String b = a + "bye"; // new StringBuilder(a).append("bye")
So in total 2 strings are created, right?
1. "hello"
2. "hellobye" (in the heap)
Or does Java create 3?
1. "hello"
2. "bye"
3. "hellobye"
If this is the case, does the append method create the appended strings in the string pool?

String a = "hello";
The JVM will create one string in the string pool. (FIRST STRING IN POOL)
Now, here comes the tricky part:
b = a + "bye";
Internally, the + operator has traditionally been compiled to StringBuilder (not StringBuffer) calls:
String b = new StringBuilder(a).append("bye").toString();
The toString() method of StringBuilder returns a new String, which is always on the heap because it is created with new String(...). The literal "bye" is the SECOND STRING IN POOL.
So b refers to "hellobye", a THIRD STRING, but it lives on the heap and not in the pool. It only enters the pool if you call b.intern().

First, the string "hello" is created and added to the string pool.
Next, the string "bye" is created and added to the string pool.
The concatenation of a and "bye" results in a new String "hellobye". Because a is a variable rather than a compile-time constant, this result is not added to the string pool.
So a total of 3 Strings are created: "hello" and "bye" in the pool, and "hellobye" on the heap.
When you create a new StringBuilder and append a string to it, the resulting string will not be added to the string pool. Instead, a new String object will be created in heap memory to represent the combined string.
So the code new StringBuilder(a).append("bye").toString() creates one new String object in heap memory for the combined string, alongside the pooled string that a refers to.

The only part of your question that can be answered with complete certainty is this:
Does append method create the appending strings in the string pool?
The answer is No. The result of a string concatenation that is not a constant expression is not placed in the string pool. At least not in any implementation of mainstream Java to date. However, there is no specification that actually guarantees this.
There are a couple of reasons why we don't know for sure how many strings are "formed".
We don't know when the String objects corresponding to the literals are actually created. In some Java implementations they will be created (and interned) when the code is loaded. In others, the string creation could occur the first time this code is run.
We don't know whether one or both of those literals are used by another class ... and hence whether this code is "forming" them.
Depending on the Java implementation, interning a string (to put it in the string pool) may result in a new String object being created. So you might get a scenario where two String objects get "formed" for each literal.
In short there is enough ambiguity that we cannot be 100% sure of the precise number of strings that are created during the execution of that code.
Does it matter that we don't know for sure?
Frankly, no. It should make zero difference to the way that you write your code [1]. Let the Java compiler and runtime take care of it ... and use a recent version of Java to get the benefit of the work they have done on optimizing this.
[1] But it is still wise to avoid string concatenation loops. I don't know if they can be optimized.
In your commented version you wrote:
String a = "hello"; // hello is created in string pool
String b = a + "bye"; // new StringBuilder(a).append("bye")
Both of those comments are questionable:
The "hello is created in string pool" comment is questionable for reasons that I gave above.
The new StringBuilder(a).append("bye") pseudo-code is questionable because that is an implementation detail. In Java 9 and later, expressions that involve string concatenations are translated to an invokedynamic bytecode. The JIT compiler generates native instructions directly. See How much does Java optimize string concatenation with +? for more information.
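The one point that can be answered with certainty, that a concatenation which is not a constant expression is not pooled, can be checked with a small sketch like the one below. The class and method names are hypothetical, and the results shown are what mainstream JVMs produce; as noted above, this says nothing about exactly when the literals themselves are created.

```java
class ConcatPoolDemo {
    // a is an ordinary variable, so a + "bye" is evaluated at run time
    // and produces a new String that is not the pooled literal.
    static boolean runtimeConcatIsPooled() {
        String a = "hello";
        String b = a + "bye";
        String literal = "hellobye";
        return b == literal;
    }

    // A final variable initialized with a literal is a constant, so the
    // compiler folds a + "bye" into the literal "hellobye".
    static boolean constantConcatIsPooled() {
        final String a = "hello";
        String b = a + "bye";
        String literal = "hellobye";
        return b == literal;
    }

    // intern() returns the pooled instance with the same contents.
    static boolean internedConcatIsPooled() {
        String a = "hello";
        String literal = "hellobye";
        String b = (a + "bye").intern();
        return b == literal;
    }

    public static void main(String[] args) {
        System.out.println(runtimeConcatIsPooled());   // false
        System.out.println(constantConcatIsPooled());  // true
        System.out.println(internedConcatIsPooled());  // true
    }
}
```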

How Java String pool works? How does Java decide whether to use it or not? [duplicate]

I know that there's a String pool which is supposed to keep some created strings in order to not duplicate them. So, if a user wants to create a string with the same value as another string, it won't be created once again (unless the new String() was called), it'll be a reference to the same object.
So, my question is why the result of this code is "false false"?
String a = "string1";
String b = "string1";
String c = new String("string1");
System.out.println(a==b);
System.out.println(a==c);
What interests me is WHY it's that way, not how to make Java use the pool.

The correct output for the above code is true false.
As for why the string pool exists: it simply optimises memory usage. There is no point in storing the same string in heap memory every time when it can be stored once in a pool and reused for as long as the JVM runs.
On the other hand, when we explicitly tell Java to create a new object with String s = new String("test"), a new object must be created and stored separately on the heap (not in the string pool). The variable s can later be pointed at other strings without affecting the string pool at all.
The other reason the string pool concept works well is that strings in Java are immutable, so a shared instance can never be changed by one of its users.
As for how Java decides when to use the pool:
Java stores every string literal (and every compile-time constant string expression) in the string pool. Strings built at run time are not pooled unless intern() is called on them.
Literals are fixed when the code is compiled, so they cannot fill the pool by themselves. The thing to watch is code that builds massive numbers of strings at run time, for example by repeated concatenation of large strings, or that interns them.
String a = "Testing";
String b = "this";
String c = "I am " + a + b + "code";
A single expression like this is handled efficiently by the compiler, but when strings are built up repeatedly, for example in a loop, StringBuffer or StringBuilder should be used.
In short, avoid creating large numbers of intermediate strings, and switch to StringBuilder in such scenarios. String constants such as "HEADER" or "http://" that are used many times are still best written as string literals.
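The behaviour described so far can be reproduced with the question's own three variables. The sketch below wraps them in a hypothetical class (PoolIdentityDemo is not from the question) and adds an equals() check to show that only the identity differs, not the contents.

```java
class PoolIdentityDemo {
    public static void main(String[] args) {
        String a = "string1";               // pooled literal
        String b = "string1";               // same pooled literal, so same object
        String c = new String("string1");   // explicitly requested new object

        System.out.println(a == b);          // true
        System.out.println(a == c);          // false
        System.out.println(a.equals(c));     // true: contents are equal
        System.out.println(a == c.intern()); // true: intern() returns the pooled object
    }

    // Helper used only to expose the two comparisons from the question.
    static boolean[] compare() {
        String a = "string1";
        String b = "string1";
        String c = new String("string1");
        return new boolean[] { a == b, a == c };
    }
}
```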

How many String objects are created on the Heap [duplicate]

I was asked a question in an interview- How many objects are created on the Heap in the following:
String s1= "A";
String s2= "A";
String s3= new String("A");
I answered 1, because a String object is created only by the new operator. When the compiler encounters s1, it simply creates "A" in the string literal pool, and s1 and s2 point to the same literal in that pool. But the interviewer confused me by asking where this pool exists.
Now, in a certain blog, I read:
"In earlier versions of Java, I think up-to Java 1.6 String literal pool is located in permgen area of heap, but in Java 1.7 updates its moved to main heap area."
So in this way, all the 3 string objects are created on the Heap. Isn't it?
But s1 and s2 point to the same literal in the string literal pool(s1==s2 is true), so a separate object shouldn't be created when s2 is encountered. So in this manner, only 2 objects should be created.
Could someone clarify as such how many String objects are created on the Heap? Am I missing something?

You are correct. One String object is created by String s3 = new String("A"); and placed on the heap. One string literal "A" is placed in the String pool.
The pool itself is allocated in the heap, but the pooled literal and the object created with new are still two separate objects.
In earlier versions of Java (up to Java 1.6, I think) the String pool was located in the PermGen area; in Java 1.7 it was moved to the main heap area. While it was in PermGen, creating too many pooled strings was always a risk, because PermGen is a very limited space (default size 64 MB) that is also used to store class metadata. Creating too many pooled strings could cause java.lang.OutOfMemoryError: PermGen space. Now that the String pool lives in a much larger memory space, this is much safer.
Source: String-literal and String-object

The answer is 1. "A" is added to the String pool, which exists in the heap, before any of the 3 lines run. The first two lines reference that existing value from the String pool. The third line forces the creation of a new object on the heap.
Here's a great write-up:
http://www.journaldev.com/797/what-is-java-string-pool
Note: I stand corrected on the comment below. The "A" already exists in the String pool before line 1 ever runs, so nothing is actually added in line 1. Therefore, the net change to the heap is 1, as you said in the interview, since only line 3 actually affects the heap.
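One way to check how many distinct String objects the three variables refer to is to collect the references in an identity-based set. This is only a sketch: DistinctStringObjects and countDistinct are hypothetical names, and the count covers the objects reachable from the variables, not when or where the JVM allocated them.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

class DistinctStringObjects {
    // Counts references by identity (==) rather than by equals().
    static int countDistinct(String... refs) {
        Set<String> seen = Collections.newSetFromMap(new IdentityHashMap<String, Boolean>());
        Collections.addAll(seen, refs);
        return seen.size();
    }

    public static void main(String[] args) {
        String s1 = "A";
        String s2 = "A";
        String s3 = new String("A");
        // s1 and s2 share the pooled literal; s3 is a separate object.
        System.out.println(countDistinct(s1, s2, s3)); // 2
    }
}
```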

String initialisation and concatenation in java

In my application everything is working fine, but I want to increase performance and optimize my code.
Which of these two is better for
1. initialisation
String str1 = new String("Hello");
String str2 = "Hello";
2. concatenation
System.out.println(s1 + s2);
System.out.println(new StringBuffer(s1).append(s2));

First of all, do not try to optimize your code for performance unless you have first profiled your application and found a very good reason to do so.
Second, for initialization of a String variable it is better not to use the String constructor. With a constant string (as done for str2), Java can pull the String object out of the String pool.
Third, do not use StringBuffer for concatenation. Use StringBuilder instead. StringBuffer's methods are synchronized, which adds overhead that single-threaded code does not need. In practice your two kinds of concatenation are nearly equivalent, as compilers have long generated bytecode that uses a StringBuilder (or, since Java 9, an invokedynamic call) for expressions like s1 + s2.

For initialization, the second approach is better:
String str2 = "Hello";
because this way you make use of the Java String pool and avoid unnecessary allocations.
For concatenation, the second approach is the best bet when you have to perform a lot of string concatenation. To concatenate only two strings, the first approach is simpler and sufficient.

Use
String str2 = "Hello";
for string initialization, because if the "Hello" string is already available in the JVM string pool, no new object will be created.
Two other suggestions:
If you are manipulating strings, use StringBuffer, as it does not create a new string with each manipulation the way the String class does.
If the buffer is only ever used by a single thread, use StringBuilder to avoid the unnecessary overhead of StringBuffer, which is designed for multi-threaded use.

For initialization it is better to use the second version because it allows the JVM to intern the String, which means it can return the same String instance every time that constant is used. The first version will always create a new String object when this code is executed, thus consuming extra memory.
For concatenation, in simple cases like your example the compiler will optimize, so both ways end up essentially the same. For more complicated String concatenations it is better to use either a StringBuffer or a StringBuilder. A StringBuffer is necessary when the same buffer is accessed from multiple threads; in other cases StringBuilder will give better performance because it doesn't do any locking.
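The multi-threaded case is the only one where StringBuffer is really needed. As a rough sketch (SharedBufferDemo and appendFromTwoThreads are hypothetical names), two threads can append to one shared StringBuffer and no append is lost, because each append call is synchronized. With a shared StringBuilder the same code would not be safe.

```java
class SharedBufferDemo {
    // Two threads append perThread characters each to one shared buffer
    // and the final length is returned.
    static int appendFromTwoThreads(int perThread) {
        StringBuffer shared = new StringBuffer();
        Runnable task = () -> {
            for (int i = 0; i < perThread; i++) {
                shared.append('x'); // synchronized, so concurrent appends are safe
            }
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for appenders", e);
        }
        return shared.length();
    }

    public static void main(String[] args) {
        System.out.println(appendFromTwoThreads(1000)); // 2000
    }
}
```

When the buffer never leaves a single method or thread, which is the usual case, StringBuilder does the same job without the locking.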

For initialization,
String str2 = "Hello";
is the better approach.
For concatenation,
System.out.println(s1 + s2);
is the better approach.
The literal is served from the String constant pool, which is meant to improve performance, and a simple + expression is already optimized by the compiler.

System.out.println(s1 + s2);
System.out.println(new StringBuffer(s1).append(s2));
Of the two above, the first would be faster, because + is translated into StringBuilder calls, which are faster than StringBuffer.
And anyway, the fastest (if somewhat nasty-looking) way of adding 2 Strings is likely the string1.concat(string2) method, which does not need to create a StringBuilder or StringBuffer object.
You can also reuse the same StringBuilder for building many Strings, by resetting it with sb.setLength(0) after each completed String:
StringBuilder sb = new StringBuilder();
String done1 = sb.append("1").append("2").append("3").toString();
sb.setLength(0);
String done2 = sb.append("4").append("5").append("6").toString();
sb.setLength(0);
String done3 = sb.append("7").append("8").append("9").toString();
sb.setLength(0);
System.out.println(done1);
System.out.println(done2);
System.out.println(done3);
Lastly, inside loops you should use StringBuilder/StringBuffer explicitly rather than relying on the magic behind +. Otherwise you can end up with many temporary StringBuilder objects instead of the single one you would create explicitly before the loop.
// not:
String result = "";
for (int i = 0; i < 20; i++) {
    result += i; // this may create a new StringBuilder on every iteration
}
System.out.println(result);
// but:
StringBuilder result1 = new StringBuilder();
for (int i = 0; i < 20; i++) {
    result1.append(i);
}
System.out.println(result1);

For initialization, the second approach is good, as it creates only one object.
String str1 = new String("Hello");
Here two objects are created: one on the heap and the other in the String pool.
String str2 = "Hello";
Here only one object is created, in the String pool.
System.out.println(s1 + s2);
Here there are three String objects in total: s1, s2, and the result of s1 + s2, which is a new object on the heap (not in the String pool).
System.out.println(new StringBuffer(s1).append(s2));
Here the combined text of s1 and s2 is held in a single StringBuffer in the heap area, and println converts it to a String when printing. So in both cases the second approach is good.

1) Go for literals. String literals are stored in a common pool, which lets strings with the same contents share storage. String objects allocated via the new operator are stored in the heap, and there is no sharing of storage for the same contents.
Besides that, literals are more reader-friendly than constructors.
2) Go for StringBuilder instead of +, which avoids creating multiple String objects.
On the second point: with 2 or 3 appends or +'s there is not much difference, but when you are appending 50 strings to one another, it matters.
Might be helpful:
How can a string be initialized using " "?
String literal behavioral specification.
Most memory-related issues are handled by Java itself. I believe in clean and readable code unless the issue has a major impact.

Don't optimize before you REALLY need to.
If you optimize when it is not needed, you decrease readability and waste time. It is really rare for string initialization to cause performance problems.

Avoid creating 'new' String objects when converting a byte[] to String using a specific charset

I'm reading from a binary file and want to convert the bytes to US ASCII strings. Is there any way to do this without calling new on String to avoid multiple semantically equal String objects being created in the string literal pool? I'm thinking that it is probably not possible since introducing String objects using double quotes is not possible here. Is this correct?
private String nextString(DataInputStream dis, int size)
        throws IOException
{
    byte[] bytesHolder = new byte[size];
    // readFully blocks until all size bytes are read; a plain read() may return fewer
    dis.readFully(bytesHolder);
    return new String(bytesHolder, Charset.forName("US-ASCII")).trim();
}

You'd have to have a cache mapping byte arrays to strings, then search through the cache for any equal values before creating a new string.
You can intern existing strings with intern() as Yishai posted - that won't stop you from creating more strings, but it'll make all but the first one (for any char sequence) very short lived. On the other hand, it'll make all the distinct strings live for a very long time indeed.
You can have "pseudo-interning" by using a Map<String, String>:
String tmp = new String(bytesHolder, Charset.forName("US-ASCII")).trim();
String cached = cache.get(tmp);
if (cached == null)
{
cached = tmp;
cache.put(tmp, tmp);
}
return cached;
You could even put a bit more effort in and end up with an LRU cache so that it'll keep the N most recently fetched strings, discarding others when it needs to.
None of that reduces the number of strings created in the first place, as I say - but is that likely to be a problem in your situation? GCs have been tuned to make it very cheap to create short-lived objects.
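The LRU variant mentioned above could be sketched on top of LinkedHashMap in access order, as below. The class name LruStringCache, the method canonical and the capacity parameter are all made up for illustration, and this version is not thread-safe.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class LruStringCache {
    private final int capacity;
    private final Map<String, String> cache;

    LruStringCache(int capacity) {
        this.capacity = capacity;
        // accessOrder = true keeps entries ordered from least to most recently used
        this.cache = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                // Drop the least recently used entry once the cache is over capacity
                return size() > LruStringCache.this.capacity;
            }
        };
    }

    // Returns the cached instance equal to s, caching s itself if none exists.
    String canonical(String s) {
        String cached = cache.get(s);
        if (cached == null) {
            cache.put(s, s);
            cached = s;
        }
        return cached;
    }

    int size() {
        return cache.size();
    }
}
```

In nextString, the freshly decoded string would be passed through canonical(...) before being returned, so equal strings share one instance for as long as they stay among the most recently used entries.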

You can call the intern() method on the string to ensure one for the whole JVM.
String s = new String(bytes, "US-ASCII").intern();
You won't avoid creating the initial string again, but you will save on the storage.
That being said, interned strings have a limited storage space, so use with caution. A better option may be to implement a HashMap with the string as the key and value and check if the string already exists and get it if it does, insert it if it doesn't. That way you won't have such memory limitations.
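A small sketch of what intern() does for strings decoded from bytes follows. InternBytesDemo and decode are hypothetical names, and StandardCharsets.US_ASCII is used instead of the charset name so that no checked exception is involved.

```java
import java.nio.charset.StandardCharsets;

class InternBytesDemo {
    // Decoding always allocates a new String, even for identical bytes.
    static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] raw = "Java".getBytes(StandardCharsets.US_ASCII);
        String first = decode(raw);
        String second = decode(raw);

        System.out.println(first == second);                   // false: two separate objects
        System.out.println(first.intern() == second.intern()); // true: one shared pooled instance
    }
}
```

Both decoded strings are still created; interning only means that the duplicates become garbage quickly while the single pooled instance is kept.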

You shouldn’t be concerned about it—unless you profiled your application and have determined the String creation to be the exact source of your problem.
If you find out that the String creation is the source of your problem I would recommend what Jon Skeet proposed, i.e. a mapping from byte[] to String. That has about the same effect as interning your Strings while not hogging up valuable memory until you restart the VM.
