Looking for the best (?) way to replace contents of a StringBuilder
I normally use this
StringBuilder stringBuilder = new StringBuilder("abc");
stringBuilder.setLength(0);
stringBuilder.append("12");
I guess one could also point to a new StringBuilder
StringBuilder stringBuilder = new StringBuilder("abc");
stringBuilder = new StringBuilder("12");
Reusing a StringBuilder like this usually saves you little to nothing. Personally, I would never bother with these things; just create a new StringBuilder object. It's simple and consistent (you're not resetting your POJOs, right?).
So, I'd say, keep it simple, and go for the second option, unless you've got some interesting constraints not mentioned in your initial post.
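As a minimal sketch of the two options discussed above (the class and method names are hypothetical, chosen only for illustration), both end up with the same contents; the difference is only whether the existing object is kept:

```java
// Hypothetical demo class; only StringBuilder itself comes from the JDK.
class ReplaceContentsDemo {

    // Option 1: keep the same builder, drop its old characters, then append.
    static String resetAndAppend(StringBuilder sb, String replacement) {
        sb.setLength(0);
        sb.append(replacement);
        return sb.toString();
    }

    // Option 2: forget the old builder and start from a fresh one.
    static String freshBuilder(String replacement) {
        return new StringBuilder(replacement).toString();
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder("abc");
        System.out.println(resetAndAppend(sb, "12")); // prints "12"
        System.out.println(freshBuilder("12"));       // prints "12"
    }
}
```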
I've always wondered why Java and C# have a String class (immutable & thread-safe) if they also have StringBuilder (mutable & not thread-safe) or StringBuffer (mutable & thread-safe). Isn't StringBuilder/StringBuffer a superset of the String class? I mean, why should I use the String class if I have the option of using StringBuilder/StringBuffer?
For example, Instead of using following,
String str;
Why can't I always use following?
StringBuilder strb; //or
StringBuffer strbu;
In short, my question is: how will my code be affected if I replace String with the StringBuffer class? Also, StringBuffer has the added advantage of mutability.
I mean, why should I use String class, if I've option of using StringBuilder/StringBuffer?
Precisely because it's immutable. Immutability has a whole host of benefits, primarily that it makes it much easier to reason about your code without creating copies of the data everywhere "just in case" something decides to mutate the value. For example:
private readonly String name;
public Person(string name)
{
if (string.IsNullOrEmpty(name)) // Or whatever
{
// Throw some exception
}
this.name = name;
}
// All the rest of the code can rely on name being a non-null
// reference to a non-empty string. Nothing can mutate it, leaving
// evil reflection aside.
Immutability makes sharing simple and efficient. That's particularly useful for multi-threaded code. It makes "modifying" (i.e. creating a new instance with different data) more painful, but in many situations that's absolutely fine, because values pass through the system without ever being modified.
Immutability is particularly useful for "simple" types such as strings, dates, numbers (BigDecimal, BigInteger etc). It allows them to be used within maps more easily, it allows a simple equality definition, etc.
1) StringBuilder and StringBuffer are both mutable, which causes problems in collections, for example when they are used as keys in a HashMap. See this link.
Another example of advantage of immutability will be what Jon has mentioned in his comments. I am just pasting here.
Someone can call Person p = new Person(builder); with a builder which initially passes my validation criteria - and then modify it afterwards, without the Person class having any say in it. In order to avoid that, the Person class would need to copy the validated data.
Immutability ensures this does not happen.
2) As String is the most extensively used object in Java, the string pool allows the same string to be reused, thus saving memory.
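The scenario quoted in point 1 can be sketched in Java roughly as follows. The Person class here is a hypothetical one that stores the caller's StringBuilder without copying it, which is exactly the mistake being described:

```java
// Hypothetical demo; Person is illustrative and not taken from any library.
class MutableNameDemo {

    static final class Person {
        private final StringBuilder name;

        Person(StringBuilder name) {
            if (name == null || name.length() == 0) {
                throw new IllegalArgumentException("name must be non-empty");
            }
            this.name = name; // no defensive copy: the caller still holds a reference
        }

        String getName() {
            return name.toString();
        }
    }

    // Returns the name Person reports after the caller empties the builder.
    static String nameAfterCallerMutation() {
        StringBuilder builder = new StringBuilder("Jon");
        Person p = new Person(builder); // passes validation here
        builder.setLength(0);           // ...and is mutated afterwards
        return p.getName();
    }

    public static void main(String[] args) {
        // The validated "non-empty" invariant no longer holds.
        System.out.println(nameAfterCallerMutation().isEmpty()); // prints "true"
    }
}
```

With a String parameter the constructor check would stay valid for the lifetime of the object, with no copy needed.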
I completely agree with Jon Skeet that immutability is one reason to use String. Another reason (from a C# perspective) is that String is actually lighter weight than StringBuilder. If you look at the reference source for both String and StringBuilder, you will see that StringBuilder actually has a number of String constants in it. As a developer, you should only use what you need, so unless you need the added benefits provided by StringBuilder, you should use String.
Many answers have already outlined that there are shortcomings to using mutable variants such as StringBuilder. To illustrate the problem, one thing that you cannot achieve with StringBuilder is associative memory, i.e. hash tables. Sure, most implementations will allow you to use StringBuilder as a key for hash tables, but they will only find the values for the exact same instance of StringBuilder. However, the typical behavior that you would want is that it does not matter where the string comes from, as only the characters are important, e.g. when you read the string from a database or file (or any other external resource).
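A small Java sketch of that lookup problem (the class and method names are hypothetical). Java's StringBuilder does not override equals or hashCode, so a HashMap compares builder keys by identity:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class for illustration only.
class BuilderKeyDemo {

    // Looks up a builder key using a different builder with the same characters.
    static boolean foundByEqualBuilder() {
        Map<StringBuilder, Integer> map = new HashMap<>();
        map.put(new StringBuilder("abc"), 1);
        return map.containsKey(new StringBuilder("abc"));
    }

    // Same experiment with String keys, which compare by content.
    static boolean foundByEqualString() {
        Map<String, Integer> map = new HashMap<>();
        map.put("abc", 1);
        return map.containsKey(new StringBuilder("abc").toString());
    }

    public static void main(String[] args) {
        System.out.println(foundByEqualBuilder()); // prints "false"
        System.out.println(foundByEqualString());  // prints "true"
    }
}
```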
However, as far as I understood your question, you were mainly asking about field types. And indeed, I see your point, particularly taking into account that we are doing the exact same thing with collections of other objects, which are usually not immutable objects but mutable collections, such as List or ArrayList in C# or Java, respectively. In the end, a string is only a collection of characters, so why not make it mutable?
The answer I would give here is that the usual behavior of how such a string is changed is very different to usual collections. If you have a collection of subsequent elements, it is very common to only add a single element to the collection, leaving most of the collection untouched, i.e. you would not discard a list to insert an item, at least unless you are programming in Haskell :). For many strings like names, this is different as you typically replace the whole string. Given the importance of a string data type, the platforms usually offer a lot of optimization for strings such as interned strings, making the choice even more biased towards strings.
However, in the end, every program is different and you might have requirements that make it more reasonable to use StringBuilder by default, but for the given reasons, I think that these cases are rather rare.
EDIT: As you were asking for examples. Consider the following code:
var stopwatch = new System.Diagnostics.Stopwatch();
stopwatch.Start();
var s = "";
for (int i = 0; i < 100000; i++)
{
s = "." + s;
}
stopwatch.Stop();
Console.WriteLine(stopwatch.ElapsedMilliseconds);
stopwatch.Restart();
var s2 = new StringBuilder();
for (int i = 0; i < 100000; i++)
{
s2.Insert(0, ".");
}
stopwatch.Stop();
Console.WriteLine(stopwatch.ElapsedMilliseconds);
Technically, both bits are doing a very similar thing, they will insert a character at the first position and shift whatever comes after. Both versions will involve copying the whole string that has been there before. The version with string completes in 1750ms on my machine whereas StringBuilder took 2245ms. However, both versions are reasonably fast, making the performance impact negligible in this case.
I would like to add some differences between String and StringBuilder classes:
Yes, as mentioned above, String is an immutable class, and its content cannot be changed after the string has been created. This allows the same string objects to be used from different threads without locking.
If you need to concatenate a lot of strings together, use the StringBuilder class. Using the "+" operator repeatedly creates a lot of string objects on the managed heap and hurts performance.
StringBuilder is a mutable class. It stores characters in an array and can manipulate them (add, remove, replace, append) without creating a new string object.
If you know the approximate length of the result string, you should set the capacity. The default capacity is 16 (.NET 4.5). Setting it improves performance because StringBuilder has an internal array of chars, and that array is reallocated whenever the character count exceeds the current capacity.
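The answer above is about .NET, but the same idea carries over to Java, whose StringBuilder also documents a default capacity of 16 characters. A minimal sketch, assuming a hypothetical joinNumbers helper and a rough size estimate of six characters per entry:

```java
// Hypothetical demo class; the per-entry size estimate is an assumption.
class CapacityDemo {

    // Joins 0..count-1 with commas, pre-sizing the builder to limit regrowth.
    static String joinNumbers(int count) {
        StringBuilder sb = new StringBuilder(count * 6);
        for (int i = 0; i < count; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(i);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(new StringBuilder().capacity());    // prints "16"
        System.out.println(new StringBuilder(128).capacity()); // prints "128"
        System.out.println(joinNumbers(5));                    // prints "0,1,2,3,4"
    }
}
```

Whether pre-sizing makes a measurable difference depends on the workload, so it is worth benchmarking before relying on it.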
String:
is immutable (so you can use it in collections)
every operation creates a new instance on the heap (strictly speaking, this depends on the code)
For performance and memory consumption purposes it makes sense to use StringBuilder.
I'm using StringBuilder, instead of String, in my code in effort
to make the code time-efficient during all that parsing & concatenation.
But when i look into its source, the substring() method of
AbstractStringBuilder and thus StringBuilder is returning a String and not a StringBuilder.
What would be the idea behind this ?
Thanks.
The reason the substring method returns an immutable String is that once you get a part of the string inside your StringBuilder, it must make a copy. It cannot give you a mutable "live view" into the middle of StringBuilder's content, because otherwise you would run into conflicts of which changes to apply to what string.
Since a copy is to be made anyway, it might as well be immutable: you can easily make it mutable if you wish by constructing a StringBuilder around it, with full understanding that it is detached from the mutable original.
To go from one StringBuilder to another containing a segment of the original, you could use:
StringBuilder original = ...;
// append(CharSequence, int, int) takes a start index and an (exclusive) end index
StringBuilder sub = new StringBuilder().append(original, start, end);
This could have been provided as a method of original, but as things stand it isn't.
This aside, you should profile your code before engaging in micro-optimisations of this sort.
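A runnable sketch of the builder-to-builder copy described above (the slice helper is a hypothetical name). It relies on StringBuilder.append(CharSequence, int, int), whose last two arguments are a start index and an exclusive end index:

```java
// Hypothetical demo class for illustration only.
class SubBuilderDemo {

    // Copies the characters in [start, end) into a new, independent builder.
    static StringBuilder slice(StringBuilder original, int start, int end) {
        return new StringBuilder().append(original, start, end);
    }

    public static void main(String[] args) {
        StringBuilder original = new StringBuilder("hello world");
        StringBuilder sub = slice(original, 6, 11);
        sub.append('!'); // changes the copy only
        System.out.println(sub);      // prints "world!"
        System.out.println(original); // prints "hello world"
    }
}
```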
Consider a class with a "buildMessage" method (something like):
public class MessageBuilder {
public String buildMessage() {
//Build up the message and return
//
//Get the header
//Get the body
//Get the footer
//return
}
}
When we build up our message, it's preferable to build up strings with a StringBuilder (or similar buffering object) instead of just concating a bunch of strings together. But does that mean you lose this benefit returning String instead of taking your StringBuilder as an argument?
In other words, this reads nicely and is easy to understand:
private String getHeader() {
StringBuilder builder = new StringBuilder();
builder.append("Hello ")
.append(this.firstname)
.append(",\n");
return builder.toString();
}
That seems more natural to me than being forced to pass in a StringBuilder, but we could also write:
private void appendHeader(StringBuilder builder) {
builder.append("Hello ")
.append(this.firstname)
.append(",\n");
}
The first option makes it possible to use the "get" methods, even if the intent is not to append the returned value to a buffer. It also makes the public method easy to understand:
public class MessageBuilder {
public String buildMessage() {
StringBuilder builder = new StringBuilder();
builder.append(getHeader())
.append(getBody())
.append(getFooter());
return builder.toString();
}
}
While using the second option leads to:
public class MessageBuilder {
public String buildMessage() {
StringBuilder builder = new StringBuilder();
appendHeader(builder);
appendBody(builder);
appendFooter(builder);
return builder.toString();
}
}
My question is whether or not the first option suffers from the same memory issues as just "concating" + " strings" + " together". I'd be interested in hearing opinions about which reads better (because if there's a clear winner as to which one's cleaner and easier to read, that would weigh heavily in its favor), but I'm curious about the efficiency of it too. I suspect there's little to no difference, but wonder if anyone "knows" about the costs associated with each approach - if that's you, please share!
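For reference, the two styles from the question can be put side by side in one compilable class. This is only a sketch: the firstname field, the constructor, and the fixed body and footer strings are assumptions added so the example runs.

```java
// Hypothetical demo class; body and footer text are placeholders.
class MessageStylesDemo {
    private final String firstname;

    MessageStylesDemo(String firstname) {
        this.firstname = firstname;
    }

    // Style 1: each part builds and returns its own String.
    private String getHeader() {
        return new StringBuilder()
                .append("Hello ")
                .append(firstname)
                .append(",\n")
                .toString();
    }

    // Style 2: each part appends to a builder owned by the caller.
    private void appendHeader(StringBuilder builder) {
        builder.append("Hello ")
               .append(firstname)
               .append(",\n");
    }

    String buildWithGetters() {
        return new StringBuilder()
                .append(getHeader())
                .append("body\n")
                .append("footer")
                .toString();
    }

    String buildWithAppenders() {
        StringBuilder builder = new StringBuilder();
        appendHeader(builder);
        builder.append("body\n").append("footer");
        return builder.toString();
    }

    public static void main(String[] args) {
        MessageStylesDemo demo = new MessageStylesDemo("Ann");
        // Both styles produce the same text; they differ in how many
        // intermediate builders and strings are created along the way.
        System.out.println(demo.buildWithGetters().equals(demo.buildWithAppenders())); // prints "true"
    }
}
```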
Reusing a StringBuilder is more efficient CPU wise. However, I doubt it will really matter.
You should use whichever approach you consider the most natural and clearest. (Which appears to be StringBuilder from what you are saying.)
Performance alone is less often a good reason to do something than many people think.
Basically, with the first option, you are suffering from two distinct inefficiencies.
1). You are creating a StringBuilder object 3 additional times (so a total of 4 times), whereas in the second option it is only created once and reused. The creation of the object must be worth it, so usually you use StringBuilder when you are doing multiple string manipulation tasks on a string.
2). Because the header, body and footer methods each return a String, in each case you are creating a normal string and copying it unnecessarily, which defeats the purpose of using StringBuilder. Basically, concatenating a normal string just recreates it from scratch, so you are suffering the same overheads that concatenation incurs.
You are probably better off creating a class, and in the constructor of the class you can create your string builder object so you don't need to do this in your upper level code. inside the class you can have the three methods, each one of them using the private string builder object when it builds and ultimately returns string. this way you can just make your calls without passing a string builder argument and without creating the object in your upper level code (the messy stuff is taken care of in the class automatically as per your code, this will make it easier to read)... :)
i hope this helps. if you would like me to explain further, do let me know.
also, you should only worry about these things if there are a large number of calls to these procedures on your server, and especially if this is happening in a For loop, and even more so if both these points are true or apply :).
cheers mate.
Just because I don't see it mentioned clearly in the other answers:
private String getHeader() {
StringBuilder builder = new StringBuilder();
builder.append("Hello ")
.append(this.firstname)
.append(",\n");
return builder.toString();
}
can be replaced with
private StringBuilder builder = new StringBuilder();
private String getHeader() {
builder.setLength(0);
builder.append("Hello ")
.append(this.firstname)
.append(",\n");
return builder.toString();
}
The buffer will be allocated once. When the buffer is extended, the extension is kept until the declaring object is garbage collected. So, no time lost in re-allocation and resizing of the buffer.
The only glitch is that you may need to make your method synchronized (i.e. thread safe).
Using an existing string buffer is faster than a string, since no allocation is required. The manual concatenation of strings is quite expensive, since multiple allocations are (often) required.
P.S.: Of course, the private allocation of the string builder can be shared between private methods, meaning the end user does not need to allocate it itself. In some cases, passing the buffer makes the code easier to read and more functional. There is no definite answer or rule here, it really depends on what you are doing and what you need.
Wishful thinking
This would be great - if StringBuilder wasn't a final class. Put the append... methods in an inner class that extends StringBuilder:
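public class MessageBuilder {
public String buildMessage() {
return "" + new MyBuilder()
.appendHeader()
.appendBody()
.appendFooter();
}
// Wishful thinking: this does not compile, because StringBuilder is final.
private class MyBuilder extends StringBuilder
{
MyBuilder appendHeader() {
append("Hello ")
.append(firstname)
.append(",\n");
return this;
}
// appendBody() and appendFooter() would follow the same pattern
}
}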
An approach that actually works - with a lot more work - is to write a (non-final!) delegate class for StringBuilder, and implement private special-purpose extensions like this one as subclasses of it. See discussion here.
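A minimal sketch of what such a delegate could look like. TextBuilder and GreetingBuilder are hypothetical names, and only a single append overload is forwarded here, whereas a real delegate would have to forward many more methods:

```java
// Hypothetical demo; none of these classes exist in the JDK.
class DelegatingBuilderDemo {

    // Non-final wrapper that forwards to a private StringBuilder.
    static class TextBuilder {
        private final StringBuilder delegate = new StringBuilder();

        TextBuilder append(String s) {
            delegate.append(s);
            return this;
        }

        @Override
        public String toString() {
            return delegate.toString();
        }
    }

    // Special-purpose extension with message-specific append methods.
    static class GreetingBuilder extends TextBuilder {
        private final String firstname;

        GreetingBuilder(String firstname) {
            this.firstname = firstname;
        }

        GreetingBuilder appendHeader() {
            append("Hello ").append(firstname).append(",\n");
            return this;
        }

        GreetingBuilder appendFooter() {
            append("Bye");
            return this;
        }
    }

    static String buildMessage(String firstname) {
        return new GreetingBuilder(firstname)
                .appendHeader()
                .appendFooter()
                .toString();
    }

    public static void main(String[] args) {
        System.out.println(buildMessage("Ann"));
    }
}
```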
The last approach would be a true builder pattern if the "StringBuilder" would be hidden in an instance variable.
The reason to use that design pattern (the first examples are NOT the Builder design pattern) is to be able to make derived classes that override the appendHeader / appendFooter and appendBody methods (e.g. if you wanted to build a small HTML snippet).
If you only want to elegantly construct a string, then the fluent example (the second-to-last one) looks the best, and is likely the easiest for younger programmers to understand.
All,
I was wondering if clearing a StringBuffer contents using the setLength(0) would make sense. i.e. Is it better to do :
while (<some condition>)
{
stringBufferVariable = new StringBuffer(128);
stringBufferVariable.append(<something>)
.append(<more>)
... ;
Append stringBufferVariable.toString() to a file;
stringBufferVariable.setLength(0);
}
My questions:
1 > Will this still have better performance than having a String object to append the contents?
I am not really sure how reinitializing the StringBuffer variable would affect the performance and hence the question.
Please pour in your comments
[edit]: Removed the 2nd question about comparing to StringBuilder since I have understood there nothing more to look into based on responses.
Better than concatenating strings?
If you're asking whether
stringBufferVariable.append("something")
.append("more");
...
will perform better than concatenating with +, then yes, usually. That's the whole reason these classes exist. Object creation is expensive compared to updating the values in a char array.
It appears most if not all compilers now convert string concatenation into using StringBuilder in simple cases such as str = "something" + "more" + "...";. The only performance difference I can then see is that the compiler won't have the advantage of setting the initial size. Benchmarks would tell you whether the difference is enough to matter. Using + would make for more readable code though.
From what I've read, the compiler apparently can't optimize concatenation done in a loop when it's something like
String str = "";
for (int i = 0; i < 10000; i++) {
str = str + i + ",";
}
so in those cases you would still want to explicitly use StringBuilder.
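A sketch of the explicit StringBuilder version of that loop, next to the original for comparison (class and method names are hypothetical). Both produce the same text, but the first creates a new String on every iteration:

```java
// Hypothetical demo class for illustration only.
class LoopConcatDemo {

    // Concatenation in a loop: every pass builds a brand new String.
    static String withPlus(int n) {
        String str = "";
        for (int i = 0; i < n; i++) {
            str = str + i + ",";
        }
        return str;
    }

    // One builder for the whole loop, converted to a String once at the end.
    static String withBuilder(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append(i).append(',');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(withBuilder(3)); // prints "0,1,2,"
        System.out.println(withPlus(1000).equals(withBuilder(1000))); // prints "true"
    }
}
```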
StringBuilder vs StringBuffer
StringBuilder is not thread-safe while StringBuffer is, but they are otherwise the same. The synchronization performed in StringBuffer makes it slower, so StringBuilder is faster and should be used unless you need the synchronization.
Should you use setLength?
The way your example is currently written I don't think the call to setLength gets you anything, since you're creating a new StringBuffer on each pass through the loop. What you should really do is
StringBuilder sb = new StringBuilder(128);
while (<some condition>) {
sb.append(<something>)
.append(<more>)
... ;
// Append sb.toString() to a file;
sb.setLength(0);
}
This avoids unnecessary object creation and setLength will only be updating an internal int variable in this case.
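A runnable variant of that loop, as a sketch: the formatLines helper is hypothetical, and a List stands in for the file so the example needs no I/O.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; the List replaces the file from the question.
class ReuseInLoopDemo {

    static List<String> formatLines(List<String> names) {
        List<String> out = new ArrayList<>();
        StringBuilder sb = new StringBuilder(128); // created once, outside the loop
        for (String name : names) {
            sb.append("Hello ").append(name).append('!');
            out.add(sb.toString()); // stands in for "append to a file"
            sb.setLength(0);        // empty the builder for the next pass
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(formatLines(Arrays.asList("a", "b"))); // prints "[Hello a!, Hello b!]"
    }
}
```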
I'm just focusing on this part of the question. (The other parts have been asked and answered many times before on SO.)
I was wondering if clearing a StringBuffer contents using the setLength(0) would make sense.
It depends on the Java class libraries you are using. In some older Sun releases of Java, the StringBuffer.toString() was implemented on the assumption that a call to sb.toString() is the last thing that is done with the buffer. The StringBuffer's original backing array becomes part of the String returned by toString(). A subsequent attempt to use the StringBuffer resulted in a new backing array being created and initialized by copying the String contents. Thus, reusing a StringBuffer as your code tries to do would actually make the application slower.
With Java 1.5 and later, a better way to code this is as follows:
bufferedWriter.append(stringBufferVariable);
stringBufferVariable.setLength(0);
This should copy the characters directly from the StringBuffer into the file buffer without any need to create a temporary String. Provided that the StringBuffer declaration is outside the loop, the setLength(0) then allows you to reuse the buffer.
Finally, you should only be worrying about all of this if you have evidence that this part of the code is (or is likely to be) a bottleneck. "Premature optimization is the root of all evil" blah, blah.
For question 2, StringBuilder will perform better than StringBuffer. StringBuffer is thread safe, meaning its methods are synchronized. StringBuilder is not synchronized. So if the code you have is going to be run by ONE thread, then StringBuilder is going to have better performance since it does not have the overhead of doing synchronization.
As camickr suggests, please check out the API for StringBuffer and StringBuilder for more information.
Also you may be interested in this article: The Sad Tragedy of Micro-Optimization Theater
1 > Will this still have better performance than having a String object to append the contents?
Yes, concatenating Strings is slow since you keep creating new String Objects.
2 > If using StringBuilder would perform better than StringBuffer here, then why?
Have you read the API description for StringBuilder and/or StringBuffer? This issue is addressed there.
I am not really sure how reinitializing the StringBuffer variable would affect the performance and hence the question.
Well, create a test program. Create a test that creates a new StringBuffer/Builder every time. Then rerun the test, resetting the length to 0 instead, and compare the times.
Perhaps I am misunderstanding something in your question... why are you setting the length to 0 at the bottom if you are just creating a new one at the start of each iteration?
Assuming the variable is a local variable to the method or that it will not be used by multiple threads if it is declared outside of a method (if it is outside of a method your code probably has issues though) then make it a StringBuilder.
If you declare the StringBuilder outside of the loop then you don't need to make a new one each time you enter the loop but you would want to set the length to 0 at the end of the loop.
If you declare the StringBuilder inside of the loop then you don't need to set the length to 0 at the end of the loop.
It is likely that declaring it outside of the loop and setting the length to 0 will be faster, but I would measure both and if there isn't a large difference declare the variable inside the loop. It is good practice to limit the scope of variables.
yup! setLength(0) is a great idea! that's what it's for. the only thing quicker would be to discard the StringBuffer & make a new one. it's faster, but i can't say anything about it being memory efficient :)
i am trying to do a simple string manipulation. input is "murder", i want to get "murderredrum".
i tried this
String str = "murder";
StringBuffer buf = new StringBuffer(str);
// buf is now "murder", so i append the reverse which is "redrum"
buf.append(buf.reverse());
System.out.println(buf);
but now i get "redrumredrum" instead of "murderredrum".
can someone explain what's wrong with my program? thank you.
The short answer
The line:
buf.append(buf.reverse());
essentially does the following:
buf.reverse(); // buf is now "redrum"
buf.append(buf);
This is why you get "redrumredrum".
That is, buf.reverse() doesn't return a new StringBuffer which is the reverse of buf. It returns buf, after it had reversed itself!
There are many ways to "fix" this, but the easiest would be to explicitly create a new StringBuffer for the reversal, so something like this:
buf.append(new StringBuffer(str).reverse());
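Both versions can be checked side by side with a small sketch (the class and method names are hypothetical):

```java
// Hypothetical demo class for illustration only.
class ReverseAppendDemo {

    // The original attempt: reverse() mutates buf and returns buf itself,
    // so the buffer ends up appending its own reversed contents.
    static String buggy(String str) {
        StringBuffer buf = new StringBuffer(str);
        buf.append(buf.reverse());
        return buf.toString();
    }

    // The fix: reverse a separate buffer so the original stays intact.
    static String fixed(String str) {
        StringBuffer buf = new StringBuffer(str);
        buf.append(new StringBuffer(str).reverse());
        return buf.toString();
    }

    public static void main(String[] args) {
        System.out.println(buggy("murder")); // prints "redrumredrum"
        System.out.println(fixed("murder")); // prints "murderredrum"
    }
}
```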
Deeper insight: comparing String and StringBuffer
String in Java is immutable. On the other hand, StringBuffer is mutable (which is why you can, among other things, append things to it).
This is why with String, a transforming method really returns a new String. This is why something like this is "wrong"
String str = "murder";
str.toUpperCase(); // this is "wrong"!!!
System.out.println(str); // still "murder"
Instead you want to do:
String str = "murder";
str = str.toUpperCase(); // YES!!!
System.out.println(str); // now "MURDER"!!!
However, the situation is far from analogous with StringBuffer. Most StringBuffer methods do return StringBuffer, but they return the same instance that it was invoked on! They do NOT return a new StringBuffer instance. In fact, you're free to discard the "result", because these methods have already accomplished what they do through various mutations (i.e. side effects) to the instance it's invoked upon.
These methods could've been declared as void, but the reason why they essentially return this; instead is because it facilitates method chaining, allowing you to write something like:
sb.append(thisThing).append(thatThing).append(oneMoreForGoodMeasure);
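A short sketch that makes the "returns the same instance" point observable (names are hypothetical; the same holds for StringBuffer):

```java
// Hypothetical demo class for illustration only.
class ChainingDemo {

    // append() hands back the very object it was called on.
    static boolean appendReturnsSameInstance() {
        StringBuilder sb = new StringBuilder("a");
        StringBuilder returned = sb.append("b");
        return returned == sb;
    }

    // Which is what makes chained calls on one builder possible.
    static String chained() {
        return new StringBuilder().append("x").append(1).append('!').toString();
    }

    public static void main(String[] args) {
        System.out.println(appendReturnsSameInstance()); // prints "true"
        System.out.println(chained());                   // prints "x1!"
    }
}
```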
Related questions
Method chaining - why is it a good practice, or not?
Fluent Interfaces - Method Chaining
Appendix: StringBuffer vs StringBuilder
Instead of StringBuffer, you should generally prefer StringBuilder, which is faster because it's not synchronized. Most of the discussion above also applies to StringBuilder.
From the documentation:
StringBuffer : A thread-safe, mutable sequence of characters. [...] As of JDK 5, this class has been supplemented with an equivalent class designed for use by a single thread, StringBuilder, which should generally be preferred as it supports all of the same operations but faster, as it performs no synchronization.
StringBuilder : A mutable sequence of characters. [...] Instances of StringBuilder are not safe for use by multiple threads. If such synchronization is required then it is recommended that StringBuffer be used.
Related questions
StringBuilder and StringBuffer in Java
Bonus material! Alternative solution!
Here's an alternative "fix" to the problem that is perhaps more readable:
StringBuilder word = new StringBuilder("murder");
StringBuilder worddrow = new StringBuilder(); // starts empty
worddrow.append(word).append(word.reverse());
System.out.println(worddrow); // "murderredrum"
Note that while this should do fine for short strings, it does use an extra buffer which means that it's not the most efficient way to solve the problem.
Related questions
Reverse a string in Java, in O(1)? - as a CharSequence, yes this can be done!
Bonus material again! The last laugh!
StringBuilder sb = new StringBuilder("ha");
sb.append(sb.append(sb));
System.out.println(sb); // "hahahaha"
buf.reverse() gets called first, and it modifies the StringBuffer to "redrum". Now you are appending "redrum" to "redrum".