Why StringBuilder when there is String?

I just encountered StringBuilder for the first time and was surprised since Java already has a very powerful String class that allows appending.
Why a second String class?
Where can I learn more about StringBuilder?

String does not allow appending. Each method you invoke on a String creates a new object and returns it. This is because String is immutable - it cannot change its internal state.
On the other hand StringBuilder is mutable. When you call append(..) it alters the internal char array, rather than creating a new string object.
Thus it is more efficient to have:
StringBuilder sb = new StringBuilder();
for (int i = 0; i < 500; i ++) {
sb.append(i);
}
rather than str += i, which would create 500 new string objects.
Note that in the example I use a loop. As helios notes in the comments, the compiler automatically translates expressions like String d = a + b + c to something like
String d = new StringBuilder(a).append(b).append(c).toString();
Note also that there is StringBuffer in addition to StringBuilder. The difference is that the former has synchronized methods. If you use it as a local variable, use StringBuilder. If it can be accessed by multiple threads, use StringBuffer (that's rarer).

Here is a concrete example of why:
int total = 50000;
String s = "";
for (int i = 0; i < total; i++) { s += String.valueOf(i); }
// 4828ms
StringBuilder sb = new StringBuilder();
for (int i = 0; i < total; i++) { sb.append(String.valueOf(i)); }
// 4ms
As you can see the difference in performance is significant.
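If you want to reproduce the comparison yourself, a minimal timing sketch could look like the one below. The class and method names (ConcatTiming, withConcat, withBuilder) are made up for illustration, and the numbers you get will depend on your JVM and machine, so treat it as a rough measurement rather than a proper benchmark.

```java
// Hypothetical class name; timings vary by JVM and machine.
class ConcatTiming {
    // Builds "0123...(n-1)" with repeated String concatenation.
    static String withConcat(int n) {
        String s = "";
        for (int i = 0; i < n; i++) {
            s += i; // produces a new String on every iteration
        }
        return s;
    }

    // Builds the same text with a single StringBuilder.
    static String withBuilder(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append(i); // appends into the builder's internal buffer
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int total = 50000;

        long t0 = System.nanoTime();
        String a = withConcat(total);
        long t1 = System.nanoTime();
        String b = withBuilder(total);
        long t2 = System.nanoTime();

        System.out.println("concat:  " + (t1 - t0) / 1_000_000 + " ms");
        System.out.println("builder: " + (t2 - t1) / 1_000_000 + " ms");
        System.out.println("same result: " + a.equals(b));
    }
}
```

Both methods return the same text; only the way it is assembled differs.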

String class is immutable whereas StringBuilder is mutable.
String s = "Hello";
s = s + "World";
The code above creates two String objects, because String is immutable.
StringBuilder sb = new StringBuilder("Hello");
sb.append("World");
The code above creates only one object, because StringBuilder is mutable.
Lesson: Whenever you need to manipulate, update, or append to a String many times, go for StringBuilder, as it's more efficient than String.

StringBuilder is for, well, building strings. Specifically, building them in a very performant way. The String class is good for a lot of things, but it has really terrible performance when assembling a new string out of smaller string parts, because each new string is a totally new, reallocated string (String is immutable). StringBuilder keeps the same sequence in place and modifies it (it is mutable).

The StringBuilder class is mutable and unlike String, it allows you to modify the contents of the string without needing to create more String objects, which can be a performance gain when you are heavily modifying a string. There is also a counterpart for StringBuilder called StringBuffer which is also synchronized so it is ideal for multithreaded environments.
The biggest problem with String is that any operation you do with it, will always return a new object, say:
String s1 = "something";
String s2 = "else";
String s3 = s1 + s2; // this is creating a new object.

To be precise, adding all strings with StringBuilder is O(N), while adding Strings is O(N^2). Checking the source code, this is internally achieved by keeping a mutable array of chars. StringBuilder uses the array-doubling technique to achieve amortized O(N) performance, at the cost of potentially doubling the required memory. You can call trimToSize at the end to solve this, but usually StringBuilder objects are only used temporarily. You can further improve performance by providing a good starting guess at the final string size.
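As a rough illustration of the capacity points above, here is a small sketch using the StringBuilder(int capacity) constructor, capacity() and trimToSize(). The class name BuilderCapacity and the join helper are hypothetical; the exact capacity values printed depend on the JDK's growth policy, so no specific numbers are claimed.

```java
// Hypothetical helper class for illustration.
class BuilderCapacity {
    // Joins the parts, pre-sizing the builder to the exact final length
    // so the internal buffer never has to grow.
    static String join(String[] parts) {
        int length = 0;
        for (String p : parts) {
            length += p.length();
        }
        StringBuilder sb = new StringBuilder(length); // starting guess = final size
        for (String p : parts) {
            sb.append(p);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder(); // default initial capacity
        for (int i = 0; i < 1000; i++) {
            sb.append('x');
        }
        System.out.println("length:   " + sb.length());
        System.out.println("capacity: " + sb.capacity()); // at least the length
        sb.trimToSize(); // asks the builder to release unused buffer space
        System.out.println("capacity after trimToSize: " + sb.capacity());

        System.out.println(join(new String[] {"Hello", ", ", "World"}));
    }
}
```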

Efficiency.
Each time you concatenate strings, a new string will be created. For example:
String out = "a" + "b" + "c";
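Conceptually, this creates a new, temporary string, copies "a" and "b" into it to result in "ab". Then it creates another new, temporary string, copies "ab" and "c" into it, to result in "abc". This result is then assigned to out. (For literals like these the compiler folds the whole expression into one constant at compile time; the copying described here applies when the operands are variables.)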
The result is a Schlemiel the Painter's algorithm of O(n²) (quadratic) time complexity.
StringBuilder, on the other hand, lets you append strings in-place, resizing the output string as necessary.
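To put numbers on the quadratic behaviour, here is a small counting model. It is only a sketch under simplifying assumptions: every piece has the same length k, each concatenation copies the whole new result, and the append variant ignores buffer growth. The class and method names are invented for this example.

```java
// Hypothetical counting model, not a measurement of a real JVM.
class CopyCount {
    // Characters copied when n pieces of length k are joined by repeated
    // concatenation: step i builds a result of i*k characters from scratch.
    static long copiedByConcat(int n, int k) {
        long copied = 0;
        long currentLength = 0;
        for (int i = 0; i < n; i++) {
            currentLength += k;      // length of the new result
            copied += currentLength; // old contents plus the new piece
        }
        return copied; // equals k * n * (n + 1) / 2
    }

    // Characters copied when appending into a buffer that never has to grow:
    // each piece is copied exactly once.
    static long copiedByAppend(int n, int k) {
        return (long) n * k;
    }

    public static void main(String[] args) {
        System.out.println(copiedByConcat(1000, 10));
        System.out.println(copiedByAppend(1000, 10));
    }
}
```

Doubling n roughly quadruples the first count but only doubles the second, which is the O(n²) versus O(n) difference described above.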

StringBuilder is good when you are dealing with larger strings. It helps you to improve performance.
Here is an article that I found helpful.
A quick Google search could have helped you. Now you hired 7 different people to do a Google search for you. :)

Java has String, StringBuffer and StringBuilder:
String: it's immutable
StringBuffer: it's mutable and thread-safe
StringBuilder: it's mutable but not thread-safe, introduced in Java 1.5
String eg:
public class T1 {
public static void main(String[] args){
String s = "Hello";
for (int i=0;i<10;i++) {
s = s+"a";
System.out.println(s);
}
}
}
output: 10 Different Strings will be created instead of just 1 String.
Helloa
Helloaa
Helloaaa
Helloaaaa
Helloaaaaa
Helloaaaaaa
Helloaaaaaaa
Helloaaaaaaaa
Helloaaaaaaaaa
Helloaaaaaaaaaa
StringBuilder eg : Only 1 StringBuilder object will be created.
public class T1 {
public static void main(String[] args){
StringBuilder s = new StringBuilder("Hello");
for (int i=0;i<10;i++) {
s.append("a");
System.out.println(s);
}
}
}
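If you want to check the "new object versus same object" claim directly, a sketch along these lines compares references with == on purpose. The class and method names are hypothetical.

```java
// Hypothetical demo class; uses reference comparison deliberately.
class IdentityDemo {
    // True if concatenation produced a different object than the original.
    static boolean concatMakesNewObject() {
        String before = "Hello";
        String after = before + "a"; // evaluated at run time, yields a new String
        return before != after;
    }

    // True if append returned the very same builder it was called on.
    static boolean appendReturnsSameObject() {
        StringBuilder before = new StringBuilder("Hello");
        StringBuilder after = before.append("a"); // append returns this
        return before == after;
    }

    public static void main(String[] args) {
        System.out.println(concatMakesNewObject());    // true
        System.out.println(appendReturnsSameObject()); // true
    }
}
```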

Related

Surround string with another string

Is there any utility method in Java that would enable me to surround a string with another string? Something like:
surround("hello","%");
which would return "%hello%"
I need just one method so the code would be nicer than adding a prefix and suffix. Also I don't want to have a custom utils class if it's not necessary.
String.format can be used for this purpose:
String s = String.format("%%%s%%", "hello");
No but you can always create a method to do this:
public String surround(String str, String surroundingStr) {
StringBuffer buffer = new StringBuffer();
buffer.append(surroundingStr).append(str).append(surroundingStr);
return buffer.toString();
}
There is another way of doing it, but do not do this if you want better performance:
public String surround(String str, String surroundingStr){
return surroundingStr + str + surroundingStr;
}
Why not use the second method?
As we all know, Strings in Java are immutable. When you concatenate these three strings, it creates two new String objects apart from your original strings str and surroundingStr. And so a total of 4 String objects are involved:
1. str
2. surroundingStr
3. surroundingStr + str
4. (surroundingStr + str) + surroundingStr
And creating objects does take time. So in the long run, the second method will degrade your performance in terms of space and time. So it's your choice which method to use.
Though this is not the case after Java 1.4, as concatenating strings with the + operator uses StringBuffer in the background (StringBuilder since Java 5). So using the second method is not a problem if your Java version is 1.4 or above. But still, if you want to concatenate strings in a loop, be careful.
My suggestion:
Use either StringBuffer or StringBuilder.
Not that I know of, but as already commented, it's a single line of code that you could write yourself.
private String SurroundWord(String word, String surround){
return surround + word + surround;
}
Do note that this will return a New String object and not edit the original string.
Create a new method:
public String surround(String s, String surr){
return surr+s+surr;
}
Tested the following and returns %hello%
public static void main (String[] args) throws java.lang.Exception
{
System.out.println(surround("hello", "%"));
}
public static String surround(String s, String sign) {
return sign + s + sign;
}
StringUtils.wrap(str,wrapWith) is what you are looking for.
If Apache Commons Lang is already one of your dependencies, then you can use it. Otherwise, as others already mentioned, it's better to add a small method to your own code base. Not a big deal.
https://github.com/apache/commons-lang/blob/master/src/main/java/org/apache/commons/lang3/StringUtils.java

Connect multiple strings efficiently

I've got an interface IO, that offers two methods, void in(String in) and String out(). I've implemented that in a first, naive, version:
private String tmp="";
public void in(String in){
tmp=tmp+in;
}
public String out(){
return tmp;
}
I know this is a horrible implementation if you have multiple, very long Strings. You need to make a new String with length = tmp.length+in.length, copy tmp, copy in. And then repeat that again for every inserted String. But what is a better implementation for that?
private List<String> tmp = new ArrayList<>(); // maybe use a different list?
public void in(String in){
    tmp.add(in);
}
public String out(){
    return connect(tmp);
}
private String connect(List<String> l){
    if(l.isEmpty()) return ""; // nothing was added yet
    if(l.size()==1) return l.get(0);
    List<String> half = new ArrayList<>();
    for(int i=0;i<l.size();i+=2){
        // the last element of an odd-sized list has no partner and is carried over as-is
        half.add(i+1<l.size() ? l.get(i)+l.get(i+1) : l.get(i));
    }
    return connect(half);
}
This is a bit better: it has to make the same number of String concatenations, but the Strings are going to be smaller on average. But it has a giant overhead, and I'm not sure it's worth it. There should be an easier option than this imho, too...
You may be looking for a StringBuilder.
private StringBuilder tmp = new StringBuilder();
public void in(String in) {
tmp.append(in);
}
public String out() {
return tmp.toString();
}
The standard library provides a class specifically for efficient string concatenation, the StringBuilder:
https://docs.oracle.com/javase/7/docs/api/java/lang/StringBuilder.html
Note that the compiler will actually desugar your string "additions" into expressions involving StringBuilders, and in a lot of simple/naïve cases it will also optimize the code to make use of the append() method instead of constantly creating new StringBuilders. In your case, however, it is definitely a good idea to explicitly use a StringBuilder.
As for your adventurous attempt at optimizing the concatenation, I honestly don't think you will notice any improvements over the naïve solution, and clean code is always preferable to "slightly faster code", unless clock cycles are extremely expensive.
From Java Doc:
If your text can change and will only be accessed from a single thread, use a StringBuilder because StringBuilder is unsynchronized.
If your text can change and will be accessed from multiple threads, use a StringBuffer because StringBuffer is synchronized.
In your case StringBuilder will work just fine.
The StringBuilder class should generally be used in preference to this
one, as it supports all of the same operations but it is faster, as it
performs no synchronization.
http://download.oracle.com/javase/6/docs/api/java/lang/StringBuffer.html
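For the rarer multi-threaded case, here is a minimal sketch of several threads appending to one shared StringBuffer. The class name SharedBufferDemo and the method appendConcurrently are made up for this example; the point is only that each synchronized append completes as a unit, so no character is lost.

```java
// Hypothetical demo class for the shared-buffer case.
class SharedBufferDemo {
    // Starts `threads` threads that each append `perThread` characters to one
    // shared StringBuffer, waits for all of them, and returns the final length.
    static int appendConcurrently(int threads, int perThread) {
        StringBuffer shared = new StringBuffer();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    shared.append('x'); // synchronized, so no append is lost
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join(); // wait until this worker has finished
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return shared.length();
    }

    public static void main(String[] args) {
        System.out.println(appendConcurrently(4, 10_000)); // prints 40000
    }
}
```

Swapping in a StringBuilder here would not be safe, because its methods are not synchronized; for a builder confined to a single thread, StringBuilder remains the better choice.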

Usage StringBuffer and String?

As is well known, String is immutable in Java. I have the following method body, which returns a String:
Partner partner = context.getComponent(ComponentNames.PARTNER_COMPONENT_NAME);
String lastAccesDate = partner.getLastAccessDate();
if(lastAccesDate == null) {
return "";
}
lastAccesDate = new SimpleDateFormat(DATE_PATTERN).format(); //1
return lastAccesDate;
The thing is because of string immutability, a new String object will be created at //1, so actually I'll have two String Objects, the first one contains partner.getLastAccessDate();, the second one new SimpleDateFormat(DATE_PATTERN).format();. The overhead is not good, how can I avoid it?
Use StringBuffer in case of multithreading (i.e. if you need a thread-safe, mutable sequence of characters); otherwise use StringBuilder.
When you assign a second string to the variable lastAccesDate, there is no lasting overhead: the garbage collector will automatically free the space occupied by the first object once nothing references it any more. So there is no need to worry about overhead.

Strings - How do they work?

How do String objects work in Java? How does term "immutable" exactly apply to string objects? Why don't we see modified string after passing through some method, though we operate on original string object value?
A String has a private final char[] (a byte[] in newer JDK versions). When a new String object is created, the array is also created and filled. It cannot later be accessed [from outside] or modified [actually it can be done with reflection, but we'll leave this aside].
It is "immutable" because once a string is created, its value cannot be changed: a "cow" string will always have the value "cow".
We don't see a modified string because it is immutable: the same object will never be changed, no matter what you do with it [besides modifying it with reflection]. So "cow" + " horse" will create a new String object, and NOT modify the original object.
if you define:
void foo(String arg) {
arg= arg + " horse";
}
and you call:
String str = "cow";
foo(str);
the str at the call site is not modified. Inside foo, arg initially refers to the same object as str; when you assigned to arg, you simply made it reference another String object, and did NOT change the original object. So str still refers to the same, unchanged object containing "cow".
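To make the contrast concrete, here is a small sketch that puts the foo method from above next to a hypothetical bar method taking a StringBuilder. The class name ReferenceDemo and the helpers afterFoo and afterBar are invented for illustration.

```java
// Hypothetical demo class contrasting reassignment with mutation.
class ReferenceDemo {
    static void foo(String arg) {
        arg = arg + " horse"; // rebinds the local parameter only
    }

    static void bar(StringBuilder arg) {
        arg.append(" horse"); // mutates the object the caller also refers to
    }

    // Returns what the caller's String looks like after calling foo.
    static String afterFoo() {
        String str = "cow";
        foo(str);
        return str;
    }

    // Returns what the caller's StringBuilder contains after calling bar.
    static String afterBar() {
        StringBuilder sb = new StringBuilder("cow");
        bar(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(afterFoo()); // cow
        System.out.println(afterBar()); // cow horse
    }
}
```

In both cases the reference is passed by value; the difference is that a StringBuilder offers methods that change the object itself, while String offers none.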
If you still want to change a String object, you can do it with reflection. However, it is ill-advised and can have some serious side effects:
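// Requires: import java.lang.reflect.Field;
// Relies on the private field layout of older JDKs (value, count, hash, offset).
// Newer JDKs no longer have count and offset, so this does not work there.
String str = "cow";
try {
    Field value = str.getClass().getDeclaredField("value");
    Field count = str.getClass().getDeclaredField("count");
    Field hash = str.getClass().getDeclaredField("hash");
    Field offset = str.getClass().getDeclaredField("offset");
    value.setAccessible(true);
    count.setAccessible(true);
    hash.setAccessible(true);
    offset.setAccessible(true);
    char[] newVal = { 'c','o','w',' ','h','o','r','s','e' };
    value.set(str,newVal);
    count.set(str,newVal.length);
    hash.set(str,0); // reset the cached hash code
    offset.set(str,0);
} catch (NoSuchFieldException e) {
    // a field is missing on this JDK; str is left unchanged
} catch (IllegalAccessException e) {
    // access was denied; str is left unchanged
}
System.out.println(str);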
From the tutorial:
The String class is immutable, so that once it is created a String object cannot be changed. The String class has a number of methods, some of which will be discussed below, that appear to modify strings. Since strings are immutable, what these methods really do is create and return a new string that contains the result of the operation.
Strings in Java are immutable (state cannot be modified once created). This offers opportunities for optimization. One example is string interning, where string literals are maintained in a string pool and new String objects are only created if the particular string literal doesn't already exist in the pool. If the string literal already exists, a reference is returned. This can only be accomplished because strings are immutable, so you don't have to worry that some object holding a reference will change it.
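The pooling behaviour described above can be observed with reference comparisons. The following is a small sketch with a made-up class name (InternDemo); it uses == on purpose to compare object identity, which is not how strings should normally be compared.

```java
// Hypothetical demo class; == is used deliberately to compare identity.
class InternDemo {
    // Two identical literals refer to the same pooled object.
    static boolean literalsShareObject() {
        String a = "cow";
        String b = "cow";
        return a == b;
    }

    // new String(...) always creates a separate object with equal contents.
    static boolean newStringIsDistinct() {
        String a = "cow";
        String b = new String("cow");
        return a != b && a.equals(b);
    }

    // intern() returns the pooled instance for the given contents.
    static boolean internReturnsPooled() {
        String a = "cow";
        String b = new String("cow").intern();
        return a == b;
    }

    public static void main(String[] args) {
        System.out.println(literalsShareObject());  // true
        System.out.println(newStringIsDistinct());  // true
        System.out.println(internReturnsPooled());  // true
    }
}
```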
Methods that appear to modify a string actually return a new instance. One example is string concatenation:
String s = "";
for( int i = 0; i < 5; i++ ){
s = s + "hi";
}
What actually happens internally (the compiler changes it):
String s = "";
for( int i = 0; i < 5; i++ ){
StringBuffer sb = new StringBuffer();
sb.append(s);
sb.append("hi");
s = sb.toString();
}
You can clearly see that new instances are created by the toString method (note that this can be made more efficient by directly using StringBuffers). StringBuffers are mutable, unlike Strings.
Every object has state. The state of a String object is the array of characters that make up the String, for example, the String "foo" contains the array ['f', 'o', 'o']. Because a String is immutable, this array can never be changed in any way, shape, or form.
Every method in every class that wants to change a String must instead return a new String that represents the altered state of the old String. That is, if you try to reverse "foo" you will get a new String object with internal state ['o', 'o', 'f'].
I think this link will help you to understand how Java String really works
Now consider the following code -
String s = "ABC";
s.toLowerCase();
The method toLowerCase() will not change the data "ABC" that s contains. Instead, a new String object is instantiated and given the data "abc" during its construction. A reference to this String object is returned by the toLowerCase() method. To make the String s contain the data "abc", a different approach is needed.
Again consider the following - s = s.toLowerCase();
Now the String s references a new String object that contains "abc". There is nothing in the syntax of the declaration of the class String that enforces it as immutable; rather, none of the String class's methods ever affect the data that a String object contains, thus making it immutable.
I didn't really understand your third question. Maybe providing a chunk of code and describing your problem would be a better option. Hope this helps.
You can also look into this blogpost for more understanding
[code samples are taken from the wiki. you can also look in there for more information]

Is there any benefit in wrapping a `String` with a `new String (...)` before running its methods?

I saw a few old code snippets in a piece of software that no one remembers who wrote that instead of doing something like:
String abc = SOME_CONSTANT.toLowerCase()
They do:
String abc = new String(SOME_CONSTANT).toLowerCase()
I can't see any value in this - seems like plain old bad programming (e.g. not understanding that String is immutable). Anyone can see a good reason for this?
Note: SOME_CONSTANT is defined as -
public static final String SOME_CONSTANT = "Some value";
No, it just creates more objects (unless the compiler optimizes it away)
The only point in wrapping a String inside another String is to force a copy. e.g.
String str = "A very very long string .......";
// uses the same underlying string which we might not need after this.
String str1 = str.substring(0, 1);
// take a copy of which only has one char in it.
String str2 = new String(str1);
I would just do
public static final String SOME_CONSTANT = "Some value";
public static final String LOWER_CONSTANT = SOME_CONSTANT.toLowerCase();
I agree with you: it's bad programming
No good reason. As you said, String is immutable, so calling toLowerCase() on it will always produce a new string anyway.
new String(someString)
only makes sense in one important case:
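String s = incrediblyLongString.substring(1000,1005);
String t = new String(s);
Assume incrediblyLongString is 1000000 chars long (e.g. an XML file) and you just want 5 chars of it. On older JDKs, where substring shares the character array of the original string, the String s will still take at least ONE MEGABYTE of memory, but String t will be created from scratch and so will take up only the necessary memory. (Newer JDKs copy the characters in substring, so the extra new String is no longer needed there.)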
I'm not sure, but I think when you use new String() you force the JVM to create a new object for that String. If you use SOME_CONSTANT.toLowerCase(), the JVM will search the pool of Strings and just reference an existing String if there is an identical one.
Maybe using new String() in this case is good practice, just to make clear that toLowerCase() will affect only the newly generated String, not the constant.
But anyway, the effect is the same.
