What actually happens when we concatenate a string S2 of size Y to a string S1 of size X (already present on the heap) using the + operator?
This is what I think:
If I execute the following function:
class StringConcatenation {
    String S1;

    String concat(String S2) {
        this.S1 = this.S1 + S2;
        return this.S1; // the method declares a String return type
    }
}
If S1 was present in the string pool (which is stored in the heap) and we execute the concat method, then the method gets executed on the stack.
So the CPU will need to copy S1 onto the stack => READ S1
As strings are immutable, Java must make a new object (let's name its reference S3).
Now the contents of S1 and S2 are copied into the new object => COPY S1 + COPY S2
Then the reference S1 points to the new object.
Therefore, the total time complexity is O(READ S1 + COPY S1 + COPY S2 ) = O(X + X + Y) = O(2*X + Y).
Is my thought process correct?
In the docs you can read:
The Java language provides special support for the string concatenation operator ( + ), and for conversion of other objects to strings. String concatenation is implemented through the StringBuilder (or StringBuffer) class and its append method.
a = a + b is the equivalent of a += b or:
a = new StringBuilder()
.append(a)
.append(b)
.toString();
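As a minimal sketch of that equivalence (the class and method names here are made up for illustration, and a modern compiler may emit something other than a StringBuilder chain), both forms produce an equal string:

```java
class ConcatEquivalence {
    // Concatenation written with the + operator.
    static String viaPlus(String a, String b) {
        return a + b;
    }

    // The StringBuilder expansion described in the quoted documentation.
    static String viaBuilder(String a, String b) {
        return new StringBuilder()
                .append(a)
                .append(b)
                .toString();
    }

    public static void main(String[] args) {
        // Prints true: the two results have equal contents.
        System.out.println(viaPlus("foo", "bar").equals(viaBuilder("foo", "bar")));
    }
}
```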
Disclaimer: The String class has undergone multiple changes to improve performance and space utilization. What happens when JIT compiles code is then entirely undefined. The following is a simplification, and ignores any optimizations that may or may not be applied.
String is a class that encapsulates a char[]. The array length is always exactly the length() of the string. The class, and the underlying array, is immutable.
class String {
private final char[] arr;
}
StringBuilder (and StringBuffer) is another class that encapsulates a char[], but the array is almost always larger than the number of characters stored in it. The class, and the array, is mutable.
class StringBuilder {
private char[] arr;
private int len;
}
When you do string concatenation with the + operator, the compiler generates that as:
// Java code
s = s1 + s2 + s3;
// Generated code
s = new StringBuilder().append(s1).append(s2).append(s3).toString();
StringBuilder will initially create the array with length 16, and will re-allocate the array when needed. Worst case is that s1, s2, and s3 are all too large for the current array, so each append() call needs to re-size the array.
This means that the operation would progress as follows:
new StringBuilder() - Creates char[16].
append(s1) - Resizes arr, then copies chars from s1.arr to the array.
append(s2) - Resizes arr, copies existing content (chars from s1) to new array, then copies chars from s2.arr to the array.
append(s3) - Resizes arr, copies existing content (chars from s1 and s2) to new array, then copies chars from s3.arr to the array.
toString() - Creates a new String with char[] sized to exactly fit the characters in the StringBuilder, then copies the content (chars from s1, s2, and s3) to the new String.
All in all, the chars from s1 end up being copied 4 times.
If the string concatenation is S1 + S2, like in the question, then the characters from S1 are copied 2 or 3 times, and the characters from S2 are copied 2 times.
Since time complexity is generally worst case, that means O(3m + 2n), not the O(2m + n) suggested in the question. Of course, Big-O eliminates constant factors, so it is actually O(m + n).
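The copy counts can be checked with a toy model. This is only a sketch: CopyCountSketch is a hypothetical class that mimics the simplified builder described here, and its growth rule (the larger of the required size and twice the old size plus 2) is an assumption about the real StringBuilder, not its actual source.

```java
class CopyCountSketch {
    private char[] arr = new char[16]; // initial capacity, as described above
    private int len = 0;
    long copied = 0;                   // total number of chars copied so far

    CopyCountSketch append(String s) {
        int needed = len + s.length();
        if (needed > arr.length) {
            // Assumed growth rule: max(required size, 2 * old size + 2).
            char[] bigger = new char[Math.max(needed, arr.length * 2 + 2)];
            System.arraycopy(arr, 0, bigger, 0, len);
            copied += len;             // existing content moved to the new array
            arr = bigger;
        }
        s.getChars(0, s.length(), arr, len);
        copied += s.length();          // incoming chars copied into the array
        len = needed;
        return this;
    }

    String build() {
        copied += len;                 // toString() copies into an exact-size array
        return new String(arr, 0, len);
    }

    // Total chars copied when concatenating the given parts in order.
    static long copiesFor(String... parts) {
        CopyCountSketch b = new CopyCountSketch();
        for (String p : parts) {
            b.append(p);
        }
        b.build();
        return b.copied;
    }
}
```

With a 100-char S1 and a 300-char S2, copiesFor reports 900, which is 3m + 2n; with two 1-char strings that fit in the initial array it reports 4, which is 2m + 2n.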
I was reading about String in Java and was trying to understand it.
At first, it was easy to see how String s1="11" and String s2=new String("11") are created, and I understood the intern method as well.
But I came across this example (given by a friend), and it confused me about everything.
I need help understanding this.
String s1 = new String(new String("2")+new String("2"));
s1.intern();
String s2="22";
System.out.print(s1==s2); //=>true as output.
String s3 =new String (new String("2")+new String("2"));
s3.intern();
String s4="22";
System.out.print(s3==s4); //=>false as output.
The output of this code is true and false.
The part for s1 and s2 was fine and was true according to my understanding, but I didn't understand the second part.
I hope someone can break the code down line by line and help me understand.
s1.intern(); adds s1 to the pool of strings, therefore the string "22" is now in the pool of strings. Therefore when you write s2 = "22" that's the same "22" as s1 and thus s1 == s2.
s3.intern() does NOT add s3 to the pool of strings because the string "22" is already there.
s3.intern() does return that same "22", which is s1, BUT THE RETURN VALUE IS NOT USED. Therefore s3 is not the same object as s4.
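For comparison, here is a small sketch of what happens when the value returned by intern() is kept. The class and method names are made up for illustration.

```java
class InternKept {
    // Keeps the result of intern(), so both variables refer to the pooled "22".
    static boolean keptResultMatchesLiteral() {
        String s3 = new String(new String("2") + new String("2")).intern();
        String s4 = "22";
        return s3 == s4;
    }

    public static void main(String[] args) {
        System.out.println(keptResultMatchesLiteral()); // true
    }
}
```

Because a string literal always refers to the pooled instance, and intern() returns that same instance, the comparison is true whether or not "22" was already in the pool.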
In Java there is the heap and the stack:
The heap is where all objects are stored.
The stack is where local variables are stored.
There are also special caches for Strings and for small Integers.
As you know, a String can be created in several ways,
such as new String("word") or just = "word". With the first form you always create a new object on the heap; with the second the word is taken from the string pool (Java's engineers thought it would be good not to create many objects when words are repeated, so they created a special pool for string literals; the same goes for boxed Integers from -128 to 127). Look at this example:
String wordOne = "hola";
String wordTwo = "hola";
String wordTres = "hola";
System.out.println(wordOne == wordTwo);   // true: same pooled literal
System.out.println(wordTres == wordTwo);  // true
System.out.println(wordOne == wordTres);  // true
String wordFour = new String("hola");
System.out.println(wordOne == wordFour);  // false: new object on the heap
Integer uno = 127;
Integer dos = 127;
System.out.println(uno == dos);           // true: 127 is inside the Integer cache
Integer tres = 128;
Integer cuatro = 128;
System.out.println(tres == cuatro);       // usually false: 128 is outside the default cache
String x = "word"; // stored in the string pool
String y = new String("it is not"); // a new object on the heap, not the pooled one
To be honest, I don't remember the rules for the pool very well, but in any case I recommend comparing all strings with wordX.equals(wordY).
Boxed numbers from -128 to 127 can also be compared with ==, but as with any objects it is safer to use equals. With numbers there is an even better option than equals: convert one of them to a primitive value (which also uses less memory).
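A short sketch of the two value comparisons recommended here, using illustrative names that are not from the original post:

```java
class BoxedCompare {
    // Compares the wrapped values, not the references.
    static boolean sameValue(Integer a, Integer b) {
        return a.equals(b);
    }

    // Unboxes both sides, so == compares primitive ints.
    static boolean sameValueUnboxed(Integer a, Integer b) {
        return a.intValue() == b.intValue();
    }

    public static void main(String[] args) {
        Integer tres = 128;
        Integer cuatro = 128;
        System.out.println(sameValue(tres, cuatro));           // true
        System.out.println(sameValueUnboxed(tres, cuatro));    // true
        System.out.println("hola".equals(new String("hola"))); // true
    }
}
```

Both methods give the same answer whether or not the values fall inside the Integer cache.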
When you create a string with the new keyword, the JVM creates a new String object in normal (non-pool) heap memory, and the literal is placed in the string constant pool. In your case, the variable s1 refers to the object in the (non-pool) heap.
String s1 = new String(new String("2")+new String("2"));
But in the next line you are calling the intern() method.
When the intern method is invoked, if the pool already contains a string equal to this String object as determined by the equals(Object) method, then the string from the pool is returned. Otherwise, this String object is added to the pool and a reference to this String object is returned.
Check Javadocs.
As "22" is not in the string pool yet, this string is added to the pool and a reference to it is returned. When you write:
String s2="22";
it simply refers to the "22" in the string pool. But calling s3.intern() will not add a new string, as "22" already exists in the pool. Check the Javadocs for intern() again: if an equal string exists in the pool, that pooled string is returned and this object is not added. Since the return value is discarded, s3 still refers to a different object.
But s4 refers to the same object as s1 and s2.
You can print the objects' identity hash codes to check whether they are the same or not. Like:
System.out.println(System.identityHashCode(s1));
Notice that the type String is capitalized and is not one of Java's 8 primitive types (int, boolean, double, char, etc.). This indicates that any instance of a String is an object that was built using the 'blueprint' of the class String. Because variables in Java that refer to objects only store the memory address where the actual object is stored, when you compare Strings with == it compares memory location.
String str1 = new String("hello");
String str2 = str1; //sets str1 and str2 pointing to same memory loc
if (str1 == str2){
//do stuff; the code will enter this if-statement in this case
}
The way to compare the values within objects in Java is with equals(), such as:
String str1 = new String("hello");
String str2 = new String("hello"); //str2 not same memory loc as str1
if (str1.equals(str2)){
//do stuff; the code will enter this if-statement in this case
}
This is a common error for beginners, since the primitive types are not objects and you CAN compare two ints for equality like:
int one = 1; //primitive types are NOT objects
int two = 2; //notice when I make an int, I don't have to say "new"
//which means a new **object**
if (one == two) {
//do stuff; in this case the program will not enter this if-statement
}
It seems that you understand everything but the meaning of the very last line. See my comment on the last line.
String s1 = new String(new String("2")+new String("2")); //declare AND initialize s1 as a new String object
s1.intern();
String s2="22"; //declare a new variable s2 and point it to the same object that s1 is pointing to
System.out.print(s1==s2);
String s3 =new String (new String("2")+new String("2"));
s3.intern();
String s4="22";
System.out.print(s3==s4); //check if s3 and s4 are stored in the same memory location = FALSE
In Java, object1 == object2 asks:
do object1 and object2 have the same address in memory?
object1.equals(object2)
asks whether they are equal, for example, do they have the same values in all fields?
So, for two Strings S1 and S2,
S1.equals(S2) asks: do they have the same characters in the same sequence?
S1 == S2 asks: are S1 and S2 stored at the same address in memory?
I know that when we use String literals as shown below, the String object is created in the String pool (if it doesn't already exist).
String str1= "hello";
String str2= "hello";
In above case only one string object will be created in pool.
But, when we use new keyword it always creates a new String object in heap memory (even though there is one in String pool)
String str3=new String("hello"); // here a new object will be created in heap.
I am confused about how many objects will be created in the cases below, and where (pool or heap memory).
1) String s="Hello";
String s1 = new String ("Hello");
2) String s = new String("Hello");
String s1 = new String("Hello");
3) String s="Hello";
String s1=new String (s);
4) String s1 = new String ("Hello");
String s="Hello";
Every invocation of new String(...) will create a new instance. You can use String.intern() to get an instance from the pool.
String s="Hello";
String s1 = new String ("Hello");
System.out.println(System.identityHashCode(s)==System.identityHashCode(s1));
String si= new String ("Hello").intern();
String s1i = new String ("Hello").intern();
System.out.println(System.identityHashCode(si)==System.identityHashCode(s1i));
This prints false and true
We can account for memory in Java’s String objects in the same way as
for any other object, except that aliasing is common for strings.
The standard String
implementation has four instance variables: a reference to a character array (8 bytes)
and three int values (4 bytes each). The first int value is an offset into the character array;
the second is a count (the string length).
In terms of the instance variable names in
the drawing on the figure, the string that is represented consists of the characters
value[offset] through value[offset + count - 1]. The third int value in String
objects is a hash code that saves recomputation in certain circumstances.
Therefore, each String object uses a total of 40 bytes (16 bytes for
object overhead plus 4 bytes for each of the three int instance variables plus 8 bytes for
the array reference plus 4 bytes of padding).
This space requirement is in addition to
the space needed for the characters themselves, which are in the array. The space needed
for the characters is accounted for separately because the char array is often shared
among strings. Since String objects are immutable, this arrangement allows the implementation
to save memory when String objects have the same underlying value[].
String values and substrings.
A String of length N typically uses 40 bytes (for the
String object) plus 24 + 2N bytes (for the array that contains the characters) for a
total of 64 + 2N bytes. But it is typical in string processing to work with substrings, and
Java’s representation is meant to allow us to do so without having to make copies of the string's characters!
Source: Algorithms 4th Edition
How many objects are being created for your examples?
1) 4
2) 4
3) 3
4) 4
Note that each String object contains a char-array with the content of the string. So when creating a new String you actually create two objects.
1) 2) and 4)
Each line in your examples either creates a String in the pool which contains a char-array (therefore we have two objects) or creates a new String which, again, contains a char-array. Note that in none of these examples do the strings share any of their content.
3)
This example is different since we use the first String (2 objects) to create the second String. In this case the second String will be a new object but it will use the very same char-array as the first one, therefore not creating a new one. This leads to a total of only 3 objects instead of 4.
One more example
String s1 = "Hello";
String s2 = "Hello";
In this case we will have only 2 objects, since s1 and s2 will both point to the same String object in the pool with the same char-array.
String s1 = new String("anil");
String s2 = s1.toUpperCase();
String s3 = s1.toLowerCase();
System.out.println(s1 == s2);
System.out.println(s1 == s3);
If the String objects are created on the heap, then both comparisons should be false. But it prints false, true.
String s1 = new String("anil");
This statement creates a new object.
And this,
String s3 = s1.toLowerCase();
points to the location of the first object, that is, s1.
And that's the reason you are getting true for the second condition.
Also see how java handles strings to get a clear understanding
Hope this helps!!
There are four String objects here:
the literal, created by the compiler and classloader
s1, created by new String()
s2, created by toUpperCase()
s3, created by toLowerCase().
No two of them are equal via the == operator.
Except that toLowerCase() may return the same object if it is already lowercase. There's nothing in the Javadoc about that, so any such behaviour in an implementation cannot be relied on.
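Since the identity of the returned object cannot be relied on, a check such as "is this string already lowercase?" is safer when written with equals(). A minimal sketch, with a made-up helper name and Locale.ROOT chosen here only to keep the result independent of the default locale:

```java
import java.util.Locale;

class CaseCheck {
    // Compares contents, so it does not matter whether toLowerCase()
    // returns the same object or a new one.
    static boolean isAlreadyLowerCase(String s) {
        return s.equals(s.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(isAlreadyLowerCase(new String("anil"))); // true
        System.out.println(isAlreadyLowerCase(new String("Anil"))); // false
    }
}
```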
Here the s1 object is created on the heap, and the literal "anil" is stored in the constant string pool.
s2 and s3 are not literals; they are the results of method calls, so the JVM does not look them up in the constant pool.
Here s1.toUpperCase() returns "ANIL". The result differs from s1, so a new object is created, and comparing it with s1 (using ==) gives false.
For s3 it is different: "anil" is already all lowercase, so toLowerCase() has nothing to change and does not create a new object.
It returns the reference to s1 itself, so the comparison gives true.
Case 1: String with Capital First Letter.
> String s1 = new String("Ajay")
String s2 = s1.toUpperCase()
String s3 = s1.toLowerCase()
System.out.println s1 == s2
System.out.println s1 == s3
false
false
Case 2: String with Small First Letter.
> String s1 = new String("ajay")
String s2 = s1.toUpperCase()
String s3 = s1.toLowerCase()
System.out.println s1 == s2
System.out.println s1 == s3
false
true
In Case 1, since the string contains a capital letter, converting it to lowercase yields a new object and hence a new reference, while in Case 2 the string is already all lowercase, so toLowerCase() returns the original object and both references point to the same object.
You can see the output from the Groovy Shell pretty clearly.
If you look at the toLowerCase() method in String class.
It calls toLowerCase(Locale locale)
toLowerCase(Locale locale) in turn uses Character.toLowerCase(c)
Character.toLowerCase(c) in the Character class has this comment -
@param ch the character to be converted.
@return the lowercase equivalent of the character, if any;
otherwise, the character itself.
I am confused about String and StringBuilder. Here is my simple code:
StringBuilder sb1 = new StringBuilder("123");
String s1 = "123";
sb1.append("abc");
s1.concat("abc");
System.out.println(sb1 + " " + s1);
The sb1 output is 123abc. That is OK, because it uses the append method. But String s1 should be 123abc,
but its output is 123. Why? And what is the purpose of the concat method? Please explain.
Thank you
But String s1 should be 123abc but its output is 123.
Strings are immutable in Java. concat doesn't change the existing string - it returns a new string. So if you use:
String result = s1.concat("abc");
then that will be "123abc" - but s1 will still be "123". (Or rather, the value of s1 will still be a reference to a string with contents "123".)
The same is true for any other methods on String which you might expect to change the contents, e.g. replace and toLowerCase. When you call a method on string but don't use the result (as is the case here), that's pretty much always a bug.
The fact that strings are immutable is the whole reason for StringBuilder existing in the first place.
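A small runnable sketch of the difference, with an illustrative class name, showing a discarded concat result, a kept one, and a StringBuilder that is modified in place:

```java
class ImmutabilityDemo {
    static String run() {
        String s1 = "123";
        s1.concat("abc");                  // result discarded: s1 is unchanged
        String result = s1.concat("abc");  // result kept in a new variable

        StringBuilder sb1 = new StringBuilder("123");
        sb1.append("abc");                 // modifies the builder itself

        return s1 + " " + result + " " + sb1;
    }

    public static void main(String[] args) {
        System.out.println(run()); // 123 123abc 123abc
    }
}
```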
The concat function does not change the string; it returns the result, which is not assigned in your case:
String concat(String textToAppend)
so change it to:
s1 = s1.concat("abc");
String objects are immutable. Immutable simply means unmodifiable or unchangeable.
But if you write
String result = s1.concat("abc");
the output is 123abc,
and
StringBuilder objects are mutable,
so you can perform changes on them.
s1.concat("abc") will create a new object on the heap with "abc" concatenated to s1, but s1 still points to the original string, which is "123". So you need to make the s1 reference point to the new object using s1 = s1.concat("abc");
I'm trying to understand if this code below creates 12 objects for a string like "stephan"
public String reverse(String str) {
if ((null == str) || (str.length() <= 1)) {
return str;
}
return reverse(str.substring(1)) + str.charAt(0);
}
This recursively reverses a string. I understand how it works. But I was wondering whether, in this case, there is a relationship between the length of the string and the number of string objects created through concatenation.
Yes, it will create tons of string objects.
Every recursive call to "reverse()" will create 2:
str.substring(1) will create a new String object
reverse() call will create a new string for its return value, but we will NOT count that since that's counted when analyzing that recursive call (e.g. it will be the string from bullet point #3 from the next reverse() call).
And since Java Strings are immutable, adding a char via "+" will create a second String object.
Therefore, for a string of length N, it will create (N-1)*2 objects (since a reverse of 1-char string does NOT create new strings); so for "stephan"'s 7 characters, it will create 6*2=12 string objects.
Theorem:
When a string is N characters long, @Phoenix's reverse implementation will create (N-1)*3 new objects.
Proof (by induction):
When str is 1 character long, it is returned directly. (1-1)*3 = 0.
When str is N characters long:
a new String will be created by .substring(1).
by the induction hypothesis, the call to reverse(...) will be returned after (N-2)*3 objects have been created.
a new StringBuilder will be created to append the string and first char (you can see this by de-compiling your byte-code).
a new String will be created by StringBuilder.toString()--this is the return value.
Altogether, there were 3 + (N-2)*3 = (N-2 + 1)*3 = (N-1)*3 objects created.
QED.
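The recurrence behind the proof can be written out directly. This sketch only counts objects according to the proof's accounting (one substring, one StringBuilder, one result String per call). It does not measure real allocations, and the class name is made up.

```java
class ReverseObjectCount {
    // Objects created by reverse() for an input of length n,
    // following the proof: 3 per call plus those of the recursive call.
    static int created(int n) {
        if (n <= 1) {
            return 0; // the string is returned directly
        }
        return 3 + created(n - 1);
    }

    public static void main(String[] args) {
        System.out.println(created("stephan".length())); // 18
    }
}
```

For "stephan" (7 characters) this gives 18, matching (N-1)*3.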
[Edit] StringBuilders:
StringBuilder (extending AbstractStringBuilder) does its own fancy footwork:
When a StringBuilder is constructed, it is initialized with a char[] of size 16.
When you append something that exceeds its present capacity, it throws that array away and creates a new char[] whose size is the larger of the required size and (<old size> * 2) + 2.
So, as soon as your input string is > 16 characters, you have essentially 2x as much StringBuilder capacity as you need. (When the input string size is less, you've got more char[] than you need.)
Considering Strings are essentially char[]s (with a few ints for good measure), you're effectively using 4 times the length of the substring in char[]s -- at each step. :(
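One way to avoid the per-step allocations is to reverse with a loop and a single pre-sized StringBuilder. This is a sketch of an alternative, not the code from the question; it keeps the same char-by-char behavior and the same handling of null and short strings.

```java
class ReverseSketch {
    static String reverse(String str) {
        if (str == null || str.length() <= 1) {
            return str;
        }
        // Pre-size the builder so it never has to grow.
        StringBuilder sb = new StringBuilder(str.length());
        for (int i = str.length() - 1; i >= 0; i--) {
            sb.append(str.charAt(i));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(reverse("stephan")); // nahpets
    }
}
```

Here the number of objects created no longer depends on the length of the input: one StringBuilder (with its array) and one result String.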