Java: String concat vs StringBuilder - optimised, so what should I do? - java

In this answer, it says (implies) that String concatenation is optimised into StringBuilder operations anyway, so when I write my code, is there any reason to write StringBuilder code in the source? Note that my use case is different from the OP's question, as I am concatenating/appending hundreds-thousands of lines.
To make myself clearer: I am well-aware about the differences of each, it's just that I don't know if it's worth actually writing StringBuilder code because it's less readable and when its supposedly slower cousin, the String class, is converted automagically in the compilation process anyway.

I think the use of StringBuilder vs + really depends on the context you are using it in.
Generally using JDK 1.6 and above the compiler will automatically join strings together using StringBuilder.
String one = "abc";
String two = "xyz";
String three = one + two;
This will compile String three as:
String three = new StringBuilder().append(one).append(two).toString();
This is quite helpful and saves us some runtime. However this process is not always optimal. Take for example:
String out = "";
for( int i = 0; i < 10000 ; i++ ) {
out = out + i;
}
return out;
If we compile to bytecode and then decompile the bytecode generated we get something like:
String out = "";
for( int i = 0; i < 10000; i++ ) {
out = new StringBuilder().append(out).append(i).toString();
}
return out;
The compiler has optimised the inner loop but certainly has not made the best possible optimisations. To improve our code we could use:
StringBuilder out = new StringBuilder();
for( int i = 0 ; i < 10000; i++ ) {
out.append(i);
}
return out.toString();
Now this is more optimal than the compiler generated code, so there is definitely a need to write code using the StringBuilder/StringBuffer classes in cases where efficient code is needed. The current compilers are not great at dealing concatenating strings in loops, however this could change in the future.
You need to look carefully to see where you need to manually apply StringBuilder and try to use it where it will not reduce readability of your code too.
Note: I compiled code using JDK 1.6, and and decompiled the code using the javap program, which spits out byte code. It is fairly easy to interpret and is often a useful reference to look at when trying to optimise code. The compiler does change you code behind the scenes so it is always interesting to see what it does!

The key phrase in your question is "supposedly slower". You need to identify if this is indeed a bottleneck, and then see which is faster.
If you are about to write this code, but have not written it yet, then write whatever is clearer to you and then if necessay see if it is a bottleneck.
While it makes sense to use the code you consider more likey to be faster, if both are equally readable, actually taking time to find out which is faster when you don't have a need is a waste of time. Readability above performance until performance is unacceptable.

It depends on the case, but StringBuilder is thought to be a bit faster. If you are doing concatenation inside a loop then I would suggest you use StringBuilder.
Anyway, I would advise you to profile and benchmark your code (if you are doing such a massive append).
Be careful though: instances of StringBuilder are mutable and are not to be shared between threads (unless you really know what you are doing.) as opposed to String which are immutable.

I might've misunderstood your question, but StringBuilder is faster when appending Strings. So, yes, if you are appending "hundreds-thousands of lines", you definitely should use StringBuilder (or StringBuffer if you are running a multithreaded app).
(More thorough answer in comments)

Related

What is the difference between concatenation using StringBuilder append() method and String "+" operator? [duplicate]

This question already has answers here:
Java: String concat vs StringBuilder - optimised, so what should I do?
(4 answers)
StringBuilder vs String concatenation in toString() in Java
(20 answers)
Why StringBuilder when there is String?
(9 answers)
Closed 2 months ago.
Given the following two code snippets:
StringBuilder sb = new StringBuilder("");
for (int i = 0; i < 5; i++) {
sb.append("hello ");
}
System.out.println(sb.toString());
and
String ss = "";
for (int i = 0; i < 5; i++) {
ss += "hello ";
}
System.out.println(ss);
I read that string concatenation using +, uses StringBuffer or StringBuilder's append method. As far as I know, StringBuilder is mutable, so any modification should not create a new instance. String is immutable, so I expect concatenation in a loop should create a new instance every time (5 times over here). But if + is using StringBuffer or StringBuilder's append method, is it really creating a new instance of ss each time in the loop? Also which among the the code snippets will be more efficient?
It used to be the case that some Java bytecode compilers would compile the String concatenation operator (+) to operations on a temporary StringBuilder.
However:
This was an implementation detail. Though the JLS (in some versions1) said that it may be optimized in that way, it was never stated that it shall be.
This is not how more recent OpenJDK Java compilers deal with this. Concatenation expressions are now compiled by the bytecode compiler to invokedynamic calls ... which are then optimized by the JIT compiler.
However, your example involves concatenation in a loop. I don't think that JIT compiler in current versions of Java can optimize across multiple loop iterations.
But, that may change too. Indeed, JEP 280 hints that the "indifying" of string concatenations by the bytecode compiler will enable further JIT compiler optimization. That could include optimization of loops like the 2nd version of your example. And if it does change, "hand optimization" to use StringBuilder calls in your source code could actually interfere with the JIT compiler's ability to optimize.
My advice: avoid premature / unnecessary optimization. Unless your application-level profiling tells you that a particular concatenation sequence is a significant bottleneck, leave it alone.
1 - For example, the Java 18 JLS says: "An implementation may choose to perform conversion and concatenation in one step to avoid creating and then discarding an intermediate String object. To increase the performance of repeated string concatenation, a Java compiler may use the StringBuffer class or a similar technique to reduce the number of intermediate String objects that are created by evaluation of an expression." JLS 15.18.1.
I'd always use StringBuilder to concatenate strings in a loop. Doing so has several advantages:
you are using StringBuilder what is made to do, so it is easier to read (but less concise);
you can pass StringBuilder around using method arguments (if you'd use the immutable String you can only use it as return value);
StringBuilder has additional options to alter the string if that would be necessary;
it will allocate some buffer space in advance, so that subsequent alterations / concatenations don't take that much memory;
it can be assumed to be relatively performant whichever direction the current JVM takes when it comes to string concatenation.
Of course, I'd use new StringBuilder() instead of the new StringBuilder("") which makes no sense - StringBuilder instances will already start with an empty string (and 16 characters of buffer space).
If you know the size of the string in advance then it does make sense to create a buffer large enough to hold it, so it doesn't need to request a larger area to hold the characters. Memory management is relatively expensive.
With regard to that, note that StringBuilder implements CharSequence so you can actually use it instead of a String for some methods if you want to avoid copies.
"Concatenate" joins two specific items together, whereas "append " adds what you specify to whatever may already be there

String.format() vs string concatenation performance

is there any difference in performance between these two idioms ?
String firstStr = "Hello ";
String secStr = "world";
String third = firstStr + secStr;
and
String firstStr = "Hello ";
String secStr = "world";
String third = String.format("%s%s",firstStr , secStr);
I know that concatenation with + operator is bad for performance specially if the operation is done a lot of times, but what about String.format() ? is it the same or it can help to improve performance?
The second one will be even slower (if you look at the source code of String.format() you will see why). It is just because String.format() executes much more code than the simple concatenation. And at the end of the day, both code versions create 3 instances of String. There are other reasons, not performance related, to use String.format(), as others already pointed out.
First of all, let me just put a disclaimer against premature optimization. Unless you are reasonably sure this is going to be a hotspot in your program, just choose the construct that fits your program the best.
If you are reasonably sure, however, and want to have good control over the concatenation, just use a StringBuilder directly. That's what the built-in concatenation operation does anyway, and there's no reason to assume that it's slow. As long as you keep the same StringBuilder and keep appending to it, rather than risking creating several in a row (which would have to be "initialized" with the previously created data), you'll have proper O(n) performance. Especially so if you make sure to initialize the StringBuilder with proper capacity.
That also said, however, a StringBuilder is, as mentioned, what the built-in concatenation operation uses anyway, so if you just keep all your concatenations "inline" -- that is, use A + B + C + D, rather than something like e = A + B followed by f = C + D followed by e + f (this way, the same StringBuilder is used and appended to throughout the entire operation) -- then there's no reason to assume it would be slow.
EDIT: In reply to your comment, I'd say that String.format is always slower. Even if it appends optimally, it can't do it any faster than StringBuilder (and therefore also the concatenation operation) does anyway, but it also has to create a Formatter object, do parsing of the input string, and so on. So there's more to it, but it still can't do the basic operation any faster.
Also, if you look internally in how the Formatter works, you'll find that it also (by default) uses a StringBuilder, just as the concatenation operation does. Therefore, it does the exact same basic operation -- that is, feeding a StringBuilder with the strings you give it. It just does it in a far more roundabout way.
As described in this excellent answer , you would rather use String.format, but mainly because of localization concerns.
Suppose that you had to supply different text for different languages, in that case then using String.format - you can just plug in the new languages(using resource files). But concatenation leaves messy code.
See : Is it better practice to use String.format over string Concatenation in Java?

Is chain of StringBuilder.append more efficient than string concatenation?

According to Netbeans hint named Use chain of .append methods instead of string concatenation
Looks for string concatenation in the parameter of an invocation of the append method of StringBuilder or StringBuffer.
Is StringBuilder.append() really more efficient than strings concatenation?
Code sample
StringBuilder sb = new StringBuilder();
sb.append(filename + "/");
vs.
StringBuilder sb = new StringBuilder();
sb.append(filename).append("/");
You have to balance readability with functionality.
Let's say you have the following:
String str = "foo";
str += "bar";
if(baz) str += "baz";
This will create 2 string builders (where you only need 1, really) plus an additional string object for the interim. You would be more efficient if you went:
StringBuilder strBuilder = new StringBuilder("foo");
strBuilder.append("bar");
if(baz) strBuilder.append("baz");
String str = strBuilder.toString();
But as a matter of style, I think the first one looks just fine. The performance benefit of a single object creation seems very minimal to me. Now, if instead of 3 strings, you had 10, or 20, or 100, I would say the performance outweighs the style. If it was in a loop, for sure I'd use the string builder, but I think just a couple strings is fine to do the 'sloppy' way to make the code look cleaner. But... this has a very dangerous trap lurking in it! Read on below (pause to build suspense... dun dun dunnnn)
There are those who say to always use the explicit string builder. One rationale is that your code will continue to grow, and it will usually do so in the same manner as it is already (i.e. they won't take the time to refactor.) So you end up with those 10 or 20 statements each creating their own builder when you don't need to. So to prevent this from the start, they say always use an explicit builder.
So while in your example, it's not going to be particularly faster, when someone in the future decides they want a file extension on the end, or something like that, if they continue to use string concatenation instead of a StringBuilder, they're going to run into performance problems eventually.
We also need to think about the future. Let's say you were making Java code back in JDK 1.1 and you had the following method:
public String concat(String s1, String s2, String s3) {
return s1 + s2 + s3;
}
At that time, it would have been slow because StringBuilder didn't exist.
Then in JDK 1.3 you decided to make it faster by using StringBuffer (StringBuilder still doesn't exist yet). You do this:
public String concat(String s1, String s2, String s3) {
StringBuffer sb = new StringBuffer();
sb.append(s1);
sb.append(s2);
sb.append(s3);
return sb.toString();
}
It gets a lot faster. Awesome!
Now JDK 1.5 comes out, and with it comes StringBuilder (which is faster than StringBuffer) and the automatic transation of
return s1 + s2 + s3;
to
return new StringBuilder().append(s1).append(s2).append(s3).toString();
But you don't get this performance benefit because you're using StringBuffer explicitly. So by being smart, you have caused a performance hit when Java got smarter than you. So you have to keep in mind that there are things out there you won't think of.
Well, your first example is essentially translated by the compiler into something along the lines:
StringBuilder sb = new StringBuilder();
sb.append(new StringBuilder().append(filename).append("/").toString());
so yes, there is a certain inefficiency here. However, whether it really matters in your program is a different question. Aside from being questionable style (hint: subjective), it usually only matters, if you are doing this in a tight loop.
None of the answers so far explicitly address the specific case that hint is for. It's not saying to always use StringBuilder#append instead of concatenation. But, if you're already using a StringBuilder, it doesn't make sense to mix in concatenation, because it creates a redundant StringBuilder (See Dirk's answer) and an unnecessary temporary String instance.
Several answers already discuss why the suggested way is more efficient, but the main point is, if you already have a StringBuilder instance, just call append on it. It's just as readable (in my opinion, and apparently whoever wrote the NetBeans hint) since you're calling append anyway, and it's a little more efficient.
Theoretically, yes. Because String objects are immutable: once constructed they cannot be changed anymore. So using "+" (concatenation) basically creates a new object each time.
Practically no. The compiler is clever enough to replace all your "+" with StringBuilder appendings.
For a more detailed explanation:
http://kaioa.com/node/59
PS: Netbeans??? Come on!
A concat of two strings is faster using this function.
However, if you have multiple strings or different data type, you should use a StringBuilder either explicitly or implicitly. Using a + with Strings is using a StringBuilder implicitly.
It's only more efficient if you are using lots of concatenation and really long strings. For general-use, such as creating a filename in your example, any string concatenation is just fine and more readable.
At any rate, this part of your application is unlikely to be the performance bottleneck.

When to use StringBuilder in Java [duplicate]

This question already has answers here:
StringBuilder vs String concatenation in toString() in Java
(20 answers)
Closed 8 years ago.
It is supposed to be generally preferable to use a StringBuilder for string concatenation in Java. Is this always the case?
What I mean is this: Is the overhead of creating a StringBuilder object, calling the append() method and finally toString() already smaller then concatenating existing strings with the + operator for two strings, or is it only advisable for more (than two) strings?
If there is such a threshold, what does it depend on (perhaps the string length, but in which way)?
And finally, would you trade the readability and conciseness of the + concatenation for the performance of the StringBuilder in smaller cases like two, three or four strings?
Explicit use of StringBuilder for regular concatenations is being mentioned as obsolete at obsolete Java optimization tips as well as at Java urban myths.
If you use String concatenation in a loop, something like this,
String s = "";
for (int i = 0; i < 100; i++) {
s += ", " + i;
}
then you should use a StringBuilder (not StringBuffer) instead of a String, because it is much faster and consumes less memory.
If you have a single statement,
String s = "1, " + "2, " + "3, " + "4, " ...;
then you can use Strings, because the compiler will use StringBuilder automatically.
Ralph's answer is fabulous. I would rather use StringBuilder class to build/decorate the String because the usage of it is more look like Builder pattern.
public String decorateTheString(String orgStr){
StringBuilder builder = new StringBuilder();
builder.append(orgStr);
builder.deleteCharAt(orgStr.length()-1);
builder.insert(0,builder.hashCode());
return builder.toString();
}
It can be use as a helper/builder to build the String, not the String itself.
As a general rule, always use the more readable code and only refactor if performance is an issue. In this specific case, most recent JDK's will actually optimize the code into the StringBuilder version in any case.
You usually only really need to do it manually if you are doing string concatenation in a loop or in some complex code that the compiler can't easily optimize.
Have a look at: http://www.javaspecialists.eu/archive/Issue068.html and http://www.javaspecialists.eu/archive/Issue105.html
Do the same tests in your environment and check if newer JDK or your Java implementation do some type of string operation better with String or better with StringBuilder.
Some compilers may not replace any string concatenations with StringBuilder equivalents. Be sure to consider which compilers your source will use before relying on compile time optimizations.
The + operator uses public String concat(String str) internally. This method copies the characters of the two strings, so it has memory requirements and runtime complexity proportional to the length of the two strings. StringBuilder works more efficent.
However I have read here that the concatination code using the + operater is changed to StringBuilder on post Java 4 compilers. So this might not be an issue at all. (Though I would really check this statement if I depend on it in my code!)
For two strings concat is faster, in other cases StringBuilder is a better choice, see my explanation in concatenation operator (+) vs concat()
The problem with String concatenation is that it leads to copying of the String object with all the associated cost. StringBuilder is not threadsafe and is therefore faster than StringBuffer, which used to be the preferred choice before Java 5. As a rule of thumb, you should not do String concatenation in a loop, which will be called often. I guess doing a few concatenations here and there will not hurt you as long as you are not talking about hundreds and this of course depends on your performance requirements. If you are doing real time stuff, you should be very careful.
The Microsoft certification material addresses this same question. In the .NET world, the overhead for the StringBuilder object makes a simple concatenation of 2 String objects more efficient. I would assume a similar answer for Java strings.

Strings are immutable - that means I should never use += and only StringBuffer?

Strings are immutable, meaning, once they have been created they cannot be changed.
So, does this mean that it would take more memory if you append things with += than if you created a StringBuffer and appended text to that?
If you use +=, you would create a new 'object' each time that has to be saved in the memory, wouldn't you?
Yes, you will create a new object each time with +=. That doesn't mean it's always the wrong thing to do, however. It depends whether you want that value as a string, or whether you're just going to use it to build the string up further.
If you actually want the result of x + y as a string, then you might as well just use string concatenation. However, if you're really going to (say) loop round and append another string, and another, etc - only needing the result as a string at the very end, then StringBuffer/StringBuilder are the way to go. Indeed, looping is really where StringBuilder pays off over string concatenation - the performance difference for 5 or even 10 direct concatenations is going to be quite small, but for thousands it becomes a lot worse - basically because you get O(N2) complexity with concatenation vs O(N) complexity with StringBuilder.
In Java 5 and above, you should basically use StringBuilder - it's unsynchronized, but that's almost always okay; it's very rare to want to share one between threads.
I have an article on all of this which you might find useful.
Rule of thumb is simple:
If you are running concatenations in a loop, don't use +=
If you are not running concatenations in a loop, using += simply does not matter. (Unless a performance critical application
In Java 5 or later, StringBuffer is thread safe, and so has some overhead that you shouldn't pay for unless you need it. StringBuilder has the same API but is not thread safe (i.e. you should only use it internal to a single thread).
Yes, if you are building up large strings, it is more efficient to use StringBuilder. It is probably not worth it to pass StringBuilder or StringBuffer around as part of your API. This is too confusing.
I agree with all the answers posted above, but it will help you a little bit to understand more about the way Java is implemented. The JVM uses StringBuffers internally to compile the String + operator (From the StringBuffer Javadoc):
String buffers are used by the
compiler to implement the binary
string concatenation operator +. For
example, the code:
x = "a" + 4 + "c"
is compiled to the equivalent of:
x = new StringBuffer().append("a").append(4).append("c")
.toString()
Likewise, x += "some new string" is equivalent to x = x + "some new string". Do you see where I'm going with this?
If you are doing a lot of String concatenations, using StringBuffer will increase your performance, but if you're only doing a couple of simple String concatenations, the Java compiler will probably optimize it for you, and you won't notice a difference in performance
Yes. String is immutable. For occasional use, += is OK. If the += operation is intensive, you should turn to StringBuilder.
But the garbage collector will end up freeing the old strings once there are no references to them
Exactly. You should use a StringBuilder though if thread-safety isn't an issue.
As a side note: There might be several String objects using the same backing char[] - for instance whenever you use substring(), no new char[] will be created which makes using it quite efficient.
Additionally, compilers may do some optimization for you. For instance if you do
static final String FOO = "foo";
static final String BAR = "bar";
String getFoobar() {
return FOO + BAR; // no string concatenation at runtime
}
I wouldn't be surprised if the compiler would use StringBuilder internally to optimize String concatenation where possible - if not already maybe in the future.
I think it relies on the GC to collect the memory with the abandoned string.
So doing += with string builder will be definitely faster if you have a lot of operation on string manipulation. But it's shouldn't a problem for most cases.
Yes you would and that is exactly why you should use StringBuffer to concatenate alot of Strings.
Also note that since Java 5 you should also prefer StringBuilder most of the time. It's just some sort of unsynchronized StringBuffer.
You're right that Strings are immutable, so if you're trying to conserve memory while doing a lot of string concatenation, you should use StringBuilder rather than +=.
However, you may not mind. Programs are written for their human readers, so you can go with clarity. If it's important that you optimize, you should profile first. Unless your program is very heavily weighted toward string activity, there will probably be other bottlenecks.
No
It will not use more memory. Yes, new objects are created, but the old ones are recycled. In the end, the amount of memory used is the same.

Categories

Resources