How many Strings are formed? [duplicate] - java

This question already has answers here:
How many string objects will be created in memory? [duplicate]
(4 answers)
Closed 3 days ago.
String a="hello";
String b=a+"Bye";
How many Strings are formed?
From my understanding of Java, what happens in this code is:
String a="hello"; // hello is created in string pool
String b=a+"bye"; // new StringBuilder(a).append("bye")
So in total 2 strings are created, right?
1.Hello
2.HelloBye (In the Heap)
Or does Java create 3?
1.Hello
2.Bye
3.HelloBye
If this is the case, does append method create the appending strings in the string pool?

String a = "hello";
JVM will create one string in the string pool. (FIRST STRING IN POOL)
Now, here comes the tricky part:
b = a + "bye";
Internally, the + operator uses StringBuilder (not StringBuffer) for concatenating strings, at least in compilers before Java 9.
String b= new StringBuilder(a).append("bye").toString(); (The toString() method of StringBuilder returns a new String, which is definitely in the heap since it is created with new String(...). The literal "bye" is the SECOND STRING IN POOL.)
Now,
b="hellobye" ("hellobye" is the THIRD STRING; it is created in the heap and is not placed in the pool)

First, the string "hello" is created and added to the string pool.
Next, the String "Bye" is created and added to the string pool.
The concatenation of a and "Bye" results in a new String "helloBye",
which is created in the heap and is not added to the string pool.
A total of 3 Strings will be created: "hello" and "Bye" in the pool,
and "helloBye" in the heap.
When you create a new StringBuilder and append a string to it, the resulting string will not be added to the string pool. Instead, a new String object will be created in the heap memory to represent the combined string.
So, the code new StringBuilder(a).append("bye") will create one new String object in the heap memory to represent the combined string and one string in pool for "a".

The only part of your question that can be answered with complete certainty is this:
Does append method create the appending strings in the string pool?
The answer is No. The result of a string concatenation that is not a constant expression is not placed in the string pool. At least not in any implementation of mainstream Java to date. However, there is no specification that actually guarantees this.
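A minimal sketch of how one could check this on a given JVM, using reference comparison against the pooled literal. The class and method names are made up for illustration, and the result only reflects the behaviour of the JVM it runs on.

```java
public class ConcatPoolCheck {
    // True only if the run-time concatenation result is the very same
    // object as the pooled literal "helloBye".
    static boolean concatResultIsPooled() {
        String a = "hello";
        String b = a + "Bye";      // not a constant expression: 'a' is not final
        return b == "helloBye";    // reference comparison, not equals()
    }

    // intern() returns the pooled instance, so this comparison succeeds.
    static boolean internedResultIsPooled() {
        String a = "hello";
        String b = a + "Bye";
        return b.intern() == "helloBye";
    }

    public static void main(String[] args) {
        System.out.println(concatResultIsPooled());   // false
        System.out.println(internedResultIsPooled()); // true
    }
}
```

The first check relies on the language rule that a concatenation which is not a constant expression yields a newly created String.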
There are a couple of reasons why we don't know for sure how many strings are "formed".
We don't know when the String objects corresponding to the literals are actually created. In some Java implementations they will be created (and interned) when the code is loaded. In others, the string creation could occur the first time this code is run.
We don't know whether one or both of those literals are used by another class ... and hence whether this code is "forming" them.
Depending on the Java implementation, interning a string (to put it in the string pool) may result in a new String object being created. So you might get a scenario where two String objects get "formed" for each literal.
In short there is enough ambiguity that we cannot be 100% sure of the precise number of strings that are created during the execution of that code.
Does it matter that we don't know for sure?
Frankly, no. It should make zero difference to the way that you write your code [1]. Let the Java compiler and runtime take care of it ... and use a recent version of Java to get the benefit of the work they have done on optimizing this.
[1] - But it is still wise to avoid string concatenation loops. I don't know if they can be optimized.
In your commented version you wrote:
String a = "hello"; // hello is created in string pool
String b = a + "bye"; // new StringBuilder(a).append("bye")
Both of those comments are questionable:
The "hello is created in string pool" comment is questionable for reasons that I gave above.
The new StringBuilder(a).append("bye") pseudo-code is questionable because that is an implementation detail. In Java 9 and later, expressions that involve string concatenation are translated to an invokedynamic bytecode. The JIT compiler generates native instructions directly. See How much does Java optimize string concatenation with +? for more information.
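Whichever translation the compiler picks, the observable result is the same. The sketch below spells out a StringBuilder chain by hand, roughly in the shape older compilers emitted, and compares it with the + form. The class and method names are hypothetical.

```java
public class ConcatDesugarSketch {
    // The source-level form; how it is compiled is up to the compiler.
    static String withPlus(String a) {
        return a + "bye";
    }

    // A hand-written approximation of the pre-Java 9 translation.
    static String withBuilder(String a) {
        return new StringBuilder().append(a).append("bye").toString();
    }

    public static void main(String[] args) {
        String plus = withPlus("hello");
        String builder = withBuilder("hello");
        System.out.println(plus.equals(builder)); // true: same contents
        System.out.println(plus == builder);      // false: two distinct objects
    }
}
```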

Related

When to avoid string interning

I've started looking into string interning and it seems like a great feature. However, I haven't found a great reason why you would want to create a string using the string constructor. After some digging I came up with this. Could someone confirm (or deny) whether this is a valid reason to create a string with new?
Say you have 2 strings:
String novel = "The contents of a very long novel...";
String page = new String("The contents of a single page...");
By default all string literals are stored in the string pool (such as with String novel) and by default all sub-strings of novel will be interned (assuming they are created as a string literal) to optimize memory allocation. Creating a string using the new keyword results in the string being created on the heap rather than in string pool. A particular case when you may want to avoid interning is if you wanted to create a string that is a sub-string of a very large string literal (such as page).
For example: say you had a very large string literal (e.g. the contents of a novel) of which you wanted to process only a portion (e.g. a single page). It may be beneficial to use the string constructor (via the new keyword) when creating the string that contains only a single page of the novel. That way the very large string may be freed from the string pool sooner, keeping only the string that contains the contents of a page on the heap. In contrast, if you created a string literal that is an interned sub-string of an entire novel, a larger part of the novel may be kept alive in the string pool despite only a small portion of the novel string being needed.
TL;DR: There is no good / valid reason to new a String in a modern JVM, or to call String.intern() explicitly.
Your question contains false statements of fact, and that means that the conclusions that you are drawing are incorrect.
By default all string literals are stored in the string pool (such as with String novel)
That is correct, though it is not "by default". (It is like saying "by default a square has 4 sides". Squares have 4 sides, period. There are no exceptions. And no defaults.)
and by default all sub-strings of novel will be interned (assuming they are created as a string literal) to optimize memory allocation.
Incorrect.
A String created by the String.substring() method is NOT interned. Not in current Java releases, or (AFAIK) in any previous release. (But see below.)
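A small sketch that illustrates this, assuming a made-up class name and reusing the novel literal from the question. It compares the substring with a pooled literal of the same contents.

```java
public class SubstringNotInterned {
    // True only if substring() handed back the pooled "contents" object.
    static boolean substringIsPooled() {
        String novel = "The contents of a very long novel...";
        String page = novel.substring(4, 12); // the characters "contents"
        return page == "contents";            // reference comparison
    }

    // After an explicit intern() the pooled object is returned.
    static boolean internedSubstringIsPooled() {
        String novel = "The contents of a very long novel...";
        String page = novel.substring(4, 12);
        return page.intern() == "contents";
    }

    public static void main(String[] args) {
        System.out.println(substringIsPooled());         // false
        System.out.println(internedSubstringIsPooled()); // true
    }
}
```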
Creating a string using the new keyword results in the string being created on the heap rather than in string pool.
Correct.
A particular case when you may want to avoid interning is if you wanted to create a string that is a sub-string of a very large string literal (such as page).
Incorrect.
I think you are confusing "interning" with something else.
Actually, in a modern JVM you always want to avoid interning. It is expensive, and it causes string objects to be (artificially) kept for longer than they need to be.
In fact, the only real reason that interning is still a thing is that it is necessary to guarantee certain semantic properties specified in the JLS about compile-time constant strings.
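A modern JVM can perform string deduplication in the garbage collector for strings that live long enough. (In HotSpot this was introduced for the G1 collector in Java 8u20, and it has to be enabled with -XX:+UseStringDeduplication.) This happens transparently ... and in cases where it is likely to be beneficial.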
Historic note.
In some old JVMs, there used to be a good reason to call new String in conjunction with substring. The problem was that the substring method had a "clever optimization" whereby it created the substrings to share the backing char[] with the original string (see note 1 below). This had the problem that references to (small) substrings could keep the (large) backing array reachable. It was a subtle kind of memory leak. You could avoid the leak by using new.
However:
The optimization was NOT interning. The substrings were created in the regular heap, and they did not have the semantics of interned strings.
The problem only affected certain String use-cases. And in practice they didn't involve large String literals.
The problem was solved long ago. The String.substring now creates a new String with its own backing array.
In summary, using new String might have been a good idea in some cases with old Java versions, but it isn't anymore. It was fixed in Java 7.
1 - Interestingly, the source code for String describes this as a speed optimization rather than a space optimization.

How Java String pool works? How does Java decide whether to use it or not? [duplicate]

This question already has answers here:
What is the difference between "text" and new String("text")?
(13 answers)
Closed 3 years ago.
I know that there's a String pool which is supposed to keep some created strings in order to not duplicate them. So, if a user wants to create a string with the same value as another string, it won't be created once again (unless the new String() was called), it'll be a reference to the same object.
So, my question is why the result of this code is "false false"?
String a = "string1";
String b = "string1";
String c = new String("string1");
System.out.println(a==b);
System.out.println(a==c);
What interests me is WHY it's that way, not how to make Java use the pool.
The correct output for the above code is true false.
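The comparisons can be reproduced with a sketch like the one below. The class name is made up, and two extra checks (equals and intern) are added to show where the contents and the pooled instance come into play.

```java
public class PoolComparison {
    // Returns { a == b, a == c, a.equals(c), a == c.intern() }.
    static boolean[] compare() {
        String a = "string1";             // literal: pooled
        String b = "string1";             // same literal: same pooled object
        String c = new String("string1"); // explicit new: a distinct heap object
        return new boolean[] { a == b, a == c, a.equals(c), a == c.intern() };
    }

    public static void main(String[] args) {
        boolean[] r = compare();
        System.out.println(r[0]); // true
        System.out.println(r[1]); // false
        System.out.println(r[2]); // true: same characters
        System.out.println(r[3]); // true: intern() returns the pooled object
    }
}
```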
As for why the string pool is there: it simply optimises memory usage. What's the point of storing the same string in heap memory every time when it can be saved once in a pool and used for as long as the JVM runs?
On the other hand, when we explicitly tell Java to create a new object with String s = new String("test"), it is created as a new object and stored separately in the heap (not in the string pool). The reference s can then be reassigned at any time without affecting the string pool at all.
Another reason why the string pool concept works well for Strings is the immutability of String in Java.
As for how to decide when to use what:
Java recognises every string literal and stores it in the string pool.
If your particular use case involves a lot of string manipulation, you should use literals carefully, because it may eventually cause a memory error if your code creates massive amounts of strings in the string pool. When concatenating large string objects, it should be avoided entirely.
String a = "Testing";
String b = "this";
String c = "I am " + a + b + "code";
Scenarios like this should be handled with StringBuffer or StringBuilder.
In all, massive use of string pooling should be avoided. One should switch to StringBuilder instead in such scenarios. String constants like "HEADER", "http://" etc. that are used multiple times are still fine as string literals.
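A minimal sketch of the StringBuilder approach, assuming made-up class and method names. The first method builds the sentence from the example; the second shows the loop case, where a single builder avoids creating an intermediate String per iteration.

```java
import java.util.Arrays;
import java.util.List;

public class BuilderConcat {
    // Builds the same text as "I am " + a + b + "code" with one explicit builder.
    static String sentence(String a, String b) {
        StringBuilder sb = new StringBuilder();
        sb.append("I am ").append(a).append(b).append("code");
        return sb.toString();
    }

    // Appends every part to one builder instead of using += in the loop.
    static String joinAll(List<String> parts) {
        StringBuilder sb = new StringBuilder();
        for (String part : parts) {
            sb.append(part);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(sentence("Testing", "this"));           // I am Testingthiscode
        System.out.println(joinAll(Arrays.asList("a", "b", "c"))); // abc
    }
}
```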

String.valueOf(someVar) vs ("" + someVar) [duplicate]

This question already has answers here:
String valueOf vs concatenation with empty string
(10 answers)
Closed 5 years ago.
I want to know the difference between two approaches. I'm working on some old code where primitive values are set to a String value by concatenating them with an empty String "".
obj.setSomeString("" + primitiveVariable);
But this link, Size of empty Java String, says: "If you're creating a separate empty string for each instance, then obviously that will take more memory."
So I thought of using the valueOf method in the String class. I checked the documentation for String.valueOf(), which says: "If the argument is null, then a string equal to "null"; otherwise, the value of obj.toString() is returned."
So which one is the better way?
obj.setSomeString("" + primitiveVariable);
obj.setSomeString(String.valueOf(primitiveVariable));
The process described above is done while iterating over a List that has more than 600 elements, and that size is expected to increase in future.
When you do "" that is not going to create an Object. It is going to create a String literal. There is a difference (How can a string be initialized using " "?) actually.
Coming to your actual question,
From String concatenation docs
The Java language provides special support for the string concatenation operator ( + ), and for conversion of other objects to strings. String concatenation is implemented through the StringBuilder(or StringBuffer) class and its append method.
So you are unnecessarily creating a StringBuilder object, which then gives you another String object.
However, valueOf directly gives you a String object. Just go for it.
Besides the performance, just think generally. Why are you concatenating with an empty string, when you actually want to convert the int to a String :)
Q. So which one is the better way
A. obj.setSomeString(String.valueOf(primitiveVariable)) is usually the better way. It's neater and more idiomatic. It states the conversion of primitiveVariable to a String explicitly, whereas the other way gets the String as a side effect of concatenation. The second way is more of a "hack," and less organized.
The other way to do it is to use Integer.toString(primitiveVariable), which is basically the same as String.valueOf.
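As a quick sanity check, the three conversions can be put side by side. This is only a sketch with a made-up class name; it shows that they produce equal Strings for an int and says nothing about their relative cost.

```java
public class ValueOfVsConcat {
    // Conversion as a side effect of concatenating with an empty literal.
    static String viaConcat(int value) {
        return "" + value;
    }

    // Explicit conversion through String.valueOf(int).
    static String viaValueOf(int value) {
        return String.valueOf(value);
    }

    // Explicit conversion through Integer.toString(int).
    static String viaToString(int value) {
        return Integer.toString(value);
    }

    public static void main(String[] args) {
        System.out.println(viaConcat(42));   // 42
        System.out.println(viaValueOf(42));  // 42
        System.out.println(viaToString(42)); // 42
    }
}
```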

String objects and reference in java [duplicate]

This question already has answers here:
How many string objects will be created in memory? [duplicate]
(4 answers)
Closed 9 years ago.
String str = "Hello"+"World";
String str1 = str + "hello";
How many objects are created and how many references are created?
String is an immutable object. Whenever you manipulate a String, the JVM creates (at least) a new String and assigns it the new (concatenated) value.
As you did not specify that you only care about String objects and references, we need to talk about StringBuffer. StringBuffer is (besides StringBuilder) a class that tries to work around the immutable nature of Strings. We all know that many times we just need to add two or more Strings together.
Imagine this code:
String sentence = "the " + "quick " + "brown " + "fox ";
Often, when the operands are not all compile-time constants, the Java compiler will not create these Strings one at a time, adding them together and then forgetting about all the intermediary Strings. Instead, a StringBuilder (a StringBuffer in very old compilers) is created, all the single Strings are added using append(String), and at the end one String is returned. When every operand is a literal, as in the line above, the whole expression is a constant expression and the compiler folds it into a single String at compile time.
What you can say for sure is that the literals end up as pooled Strings. Since "Hello"+"World" is a constant expression, the pooled Strings here are "HelloWorld" and "hello", and str refers to the former. str1 refers to a new String that is created at run time by the concatenation and is not pooled.
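One way to see how the two lines from the question behave is to compare the results with literals of the same contents. The sketch below uses a made-up class name and relies on the language rule that a concatenation of literals is a constant expression, while a concatenation involving a non-final variable is evaluated at run time.

```java
public class ConstantFolding {
    // "Hello" + "World" is a constant expression, so str is the pooled "HelloWorld".
    static boolean literalConcatIsPooled() {
        String str = "Hello" + "World";
        return str == "HelloWorld";
    }

    // str is a non-final variable, so str + "hello" creates a new String at run time.
    static boolean variableConcatIsPooled() {
        String str = "Hello" + "World";
        String str1 = str + "hello";
        return str1 == "HelloWorldhello";
    }

    public static void main(String[] args) {
        System.out.println(literalConcatIsPooled());  // true
        System.out.println(variableConcatIsPooled()); // false
    }
}
```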

String Creation in Java (Memory Usage)

In Java, is there a difference between the following two pieces of code? I'm looking for answers in terms of memory usage and the String pool.
The first:
String s = new String();
s = "abcdef";
The second:
String s = new String("abcdef");
Thanks.
In the first one you do a creation and then a value assignment. In the second one you just do a creation. The first one does (nearly) twice as much processor work. Speaking of memory, there's no difference.
And String pool explanation to your question:
What is the Java string pool and how is "s" different from new String("s")?
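With respect to the string pool, the two snippets do end up differently, which a sketch like this can show. The class and method names are made up for illustration.

```java
public class NewStringComparison {
    // First snippet: the empty String is discarded and s ends up
    // referring to the pooled literal itself.
    static boolean firstRefersToPooled() {
        String s = new String();
        s = "abcdef";
        return s == "abcdef";
    }

    // Second snippet: s refers to a separate heap object that merely
    // has the same contents as the pooled literal.
    static boolean secondRefersToPooled() {
        String s = new String("abcdef");
        return s == "abcdef";
    }

    public static void main(String[] args) {
        System.out.println(firstRefersToPooled());  // true
        System.out.println(secondRefersToPooled()); // false
    }
}
```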
