Is it always a bad idea to use + to concatenate strings - java

I have code as follows :
String s = "";
for (My my : myList) {
s += my.getX();
}
FindBugs always reports an error when I do this.

I would use + if you are manually concatenating,
String word = "Hello";
word += " World!";
However, if you are iterating and concatenating I would suggest StringBuilder,
StringBuilder sb = new StringBuilder();
for (My my : myList) {
sb.append(my.getX());
}

The String object is immutable in Java. Each + means another object. You could use StringBuffer to minimize the amount of created objects.

Each time you do string += string, it calls a method like this:
private String(String s1, String s2) {
if (s1 == null) {
s1 = "null";
}
if (s2 == null) {
s2 = "null";
}
count = s1.count + s2.count;
value = new char[count];
offset = 0;
System.arraycopy(s1.value, s1.offset, value, 0, s1.count);
System.arraycopy(s2.value, s2.offset, value, s1.count, s2.count);
}
In case of StringBuilder, it comes to:
final void append0(String string) {
if (string == null) {
appendNull();
return;
}
int adding = string.length();
int newSize = count + adding;
if (newSize > value.length) {
enlargeBuffer(newSize);
}
string.getChars(0, adding, value, count);
count = newSize;
}
As you can see, string + string creates a lot of overhead, and in my opinion it should be avoided if possible. If you think using StringBuilder is bulky or too long, you can just write a method and use it indirectly, like:
public static String scat(String... vargs) {
StringBuilder sb = new StringBuilder();
for (String str : vargs)
sb.append(str);
return sb.toString();
}
And use it like:
String abcd = scat("a","b","c","d");
I'm told this is about the same as string.Concat() in C#. In your case it would be wise to write an overload of scat, like:
public static String scat(Collection<?> vargs) {
StringBuilder sb = new StringBuilder();
for (Object str : vargs)
sb.append(str);
return sb.toString();
}
Then you can call it with:
String result = scat(myList);

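The compiler can optimize some things. Two literals such as "foo" + "bar" are folded into one constant at compile time, and a concatenation of non-constant strings such as
a + b
has traditionally been compiled to
StringBuilder s1 = new StringBuilder();
s1.append(a).append(b);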
However, this is still suboptimal, since the builder starts with a default capacity of 16. As with many things, though, you should find your biggest bottlenecks and work your way down the list. It doesn't hurt to be in the habit of using a StringBuilder pattern from the get-go, though, especially if you're able to calculate an optimal initial capacity.
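To illustrate the point about calculating the initial size, here is a minimal sketch. The class and method names (PresizedConcat, concat) are made up for this example, and it assumes the list contains no null elements:

```java
import java.util.Arrays;
import java.util.List;

final class PresizedConcat {
    // Concatenates the parts using a StringBuilder whose capacity is
    // computed up front, so its buffer does not need to grow while appending.
    static String concat(List<String> parts) {
        int total = 0;
        for (String part : parts) {
            total += part.length();
        }
        StringBuilder sb = new StringBuilder(total); // exact capacity
        for (String part : parts) {
            sb.append(part);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(concat(Arrays.asList("foo", "bar", "baz"))); // prints: foobarbaz
    }
}
```

Whether the extra pass over the input pays off depends on the data, so measure before adopting it everywhere.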

Premature optimization can be bad as well as it often reduces readability and is usually completely unnecessary. Use + if it is more readable unless you actually have an overriding concern.

It is not 'always bad' to use "+". Using StringBuffer everywhere can make code really bulky.
If someone put a lot of "+" in the middle of an intensive, time-critical loop, I'd be annoyed. If someone put a lot of "+" in a rarely-used piece of code I would not care.

I would say use plus in the following:
String c = "a" + "b";
And use the StringBuilder class everywhere else.
As already mentioned, in the first case it will be optimized by the compiler, and it's more readable.

One reason FindBugs might complain about the concatenation operator (be it "+" or "+=") is localizability. In the example you gave it is not so apparent, but in the following code it is:
String result = "Scanning found " + Integer.toString(numberOfViruses) + " viruses";
If this looks familiar, you need to change your coding style. It reads fine in English, but it can be a nightmare for translators, because you cannot guarantee that the word order of the sentence stays the same after translation: some languages need "1 blah blah", others "blah blah 3". In such cases you should always use MessageFormat.format() to build compound sentences; using the concatenation operator is clearly an internationalization bug.
BTW, I put another i18n defect in there. Can you spot it?
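A minimal sketch of the MessageFormat approach follows. The class name ScanMessage is hypothetical, and the patterns are hard-coded here only for brevity; in a real application they would presumably come from a resource bundle per locale. The second pattern is just a reordered English sentence standing in for a translation:

```java
import java.text.MessageFormat;
import java.util.Locale;

final class ScanMessage {
    // Formats the count into the pattern; the translator controls where {0} goes.
    static String format(String pattern, Locale locale, int count) {
        return new MessageFormat(pattern, locale).format(new Object[] { count });
    }

    public static void main(String[] args) {
        // Same code, different word order, decided entirely by the pattern.
        System.out.println(format("Scanning found {0} viruses", Locale.ENGLISH, 3));
        System.out.println(format("{0} viruses were found by the scan", Locale.ENGLISH, 3));
    }
}
```

Because the number goes through the format for the given locale, it is also rendered in a locale-aware way, which plain Integer.toString() does not do.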

The running time of concatenating two strings is proportional to the combined length of the strings. In a loop, the accumulated string keeps growing, so each iteration costs more than the last and the total running time grows quadratically. So if concatenation is needed in a loop, it's better to use StringBuilder, like Anthony suggested.
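The growth can be sketched with a simplified cost model that only counts copied characters. This is an illustration, not a benchmark: the class and method names are made up, and the model ignores buffer growth inside StringBuilder as well as the final toString() copy:

```java
final class CopyCost {
    // Characters copied when n pieces of length len are joined with +=:
    // iteration k builds a brand-new string of k * len characters.
    static long copiedWithPlus(int n, int len) {
        long copied = 0;
        for (int k = 1; k <= n; k++) {
            copied += (long) k * len;
        }
        return copied;
    }

    // Characters copied into a builder that never has to grow:
    // every piece is copied exactly once.
    static long copiedWithPresizedBuilder(int n, int len) {
        return (long) n * len;
    }

    public static void main(String[] args) {
        System.out.println(copiedWithPlus(1000, 10));            // prints: 5005000
        System.out.println(copiedWithPresizedBuilder(1000, 10)); // prints: 10000
    }
}
```

For 1,000 pieces of 10 characters, that is roughly five million copied characters versus ten thousand under this model.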

Related

Alternative to successive String.replace

I want to replace some strings in a String input :
string=string.replace("<h1>","<big><big><big><b>");
string=string.replace("</h1>","</b></big></big></big>");
string=string.replace("<h2>","<big><big>");
string=string.replace("</h2>","</big></big>");
string=string.replace("<h3>","<big>");
string=string.replace("</h3>","</big>");
string=string.replace("<h4>","<b>");
string=string.replace("</h4>","</b>");
string=string.replace("<h5>","<small><b>");
string=string.replace("</h5>","</b></small>");
string=string.replace("<h6>","<small>");
string=string.replace("</h6>","</small>");
As you can see this approach is not the best, because each time I have to search for the portion to replace etc, and Strings are immutable... Also the input is large, which means that some performance issues are to be considered.
Is there any better approach to reduce the complexity of this code ?
Although StringBuilder.replace() is a huge improvement compared to String.replace(), it is still very far from being optimal.
The problem with StringBuilder.replace() is that if the replacement has different length than the replaceable part (applies to our case), a bigger internal char array might have to be allocated, and the content has to be copied, and then the replace will occur (which also involves copying).
Imagine this: you have a text with 10,000 characters. If you want to replace the "XY" substring found at position 1 (2nd character) with "ABC", the implementation has to reallocate a char buffer which is at least larger by 1, has to copy the old content to the new array, and has to shift 9,997 characters (starting at position 3) to the right by 1 to fit "ABC" into the place of "XY"; finally, the characters of "ABC" are copied to the start position 1. This has to be done for every replace! This is slow.
Faster Solution: Building Output On-The-Fly
We can build the output on-the-fly: parts that don't contain replaceable texts can simply be appended to the output, and if we find a replaceable fragment, we append the replacement instead of it. Theoretically it's enough to loop over the input only once to generate the output. Sounds simple, and it's not that hard to implement it.
Implementation:
We will use a Map preloaded with mappings of the replaceable-replacement strings:
Map<String, String> map = new HashMap<>();
map.put("<h1>", "<big><big><big><b>");
map.put("</h1>", "</b></big></big></big>");
map.put("<h2>", "<big><big>");
map.put("</h2>", "</big></big>");
map.put("<h3>", "<big>");
map.put("</h3>", "</big>");
map.put("<h4>", "<b>");
map.put("</h4>", "</b>");
map.put("<h5>", "<small><b>");
map.put("</h5>", "</b></small>");
map.put("<h6>", "<small>");
map.put("</h6>", "</small>");
And using this, here is the replacer code: (more explanation after the code)
public static String replaceTags(String src, Map<String, String> map) {
StringBuilder sb = new StringBuilder(src.length() + src.length() / 2);
for (int pos = 0;;) {
int ltIdx = src.indexOf('<', pos);
if (ltIdx < 0) {
// No more '<', we're done:
sb.append(src, pos, src.length());
return sb.toString();
}
sb.append(src, pos, ltIdx); // Copy chars before '<'
// Check if our hit is replaceable:
boolean mismatch = true;
for (Entry<String, String> e : map.entrySet()) {
String key = e.getKey();
if (src.regionMatches(ltIdx, key, 0, key.length())) {
// Match, append the replacement:
sb.append(e.getValue());
pos = ltIdx + key.length();
mismatch = false;
break;
}
}
if (mismatch) {
sb.append('<');
pos = ltIdx + 1;
}
}
}
Testing it:
String in = "Yo<h1>TITLE</h1><h3>Hi!</h3>Nice day.<h6>Hi back!</h6>End";
System.out.println(in);
System.out.println(replaceTags(in, map));
Output: (wrapped to avoid scroll bar)
Yo<h1>TITLE</h1><h3>Hi!</h3>Nice day.<h6>Hi back!</h6>End
Yo<big><big><big><b>TITLE</b></big></big></big><big>Hi!</big>Nice day.
<small>Hi back!</small>End
This solution is faster than using regular expressions, as that involves much overhead, like compiling a Pattern, creating a Matcher etc., and regexp is also much more general. It also creates many temporary objects under the hood which are thrown away after the replace. Here I only use a StringBuilder (plus the char array under its hood) and the code iterates over the input String only once. This solution is also much faster than using StringBuilder.replace(), as detailed at the top of this answer.
Notes and Explanation
I initialized the StringBuilder in the replaceTags() method like this:
StringBuilder sb = new StringBuilder(src.length() + src.length() / 2);
So basically I created it with an initial capacity of 150% of the length of the original String. This is because our replacements are longer than the replaceable texts, so if replacing occurs, the output will obviously be longer than the input. Giving a larger initial capacity to StringBuilder will usually avoid internal char[] reallocation altogether (of course the required initial capacity depends on the replaceable-replacement pairs and their frequency/occurrence in the input, but this +50% is a reasonable upper estimate).
I also utilized the fact that all replaceable strings start with a '<' character, so finding the next potential replaceable position becomes blazing-fast:
int ltIdx = src.indexOf('<', pos);
It's just a simple loop and char comparisons inside String, and since it always starts searching from pos (and not from the start of the input), overall the code iterates over the input String only once.
And finally, to tell whether a replaceable String occurs at the potential position, we use the String.regionMatches() method to check the replaceable strings. This is also blazing-fast, as all it does is compare char values in a loop and return at the very first mismatching character.
And a PLUS:
The question doesn't mention it, but our input is an HTML document. HTML tags are case-insensitive which means the input might contain <H1> instead of <h1>.
To this algorithm this is not a problem. The regionMatches() in the String class has an overload which supports case-insensitive comparison:
boolean regionMatches(boolean ignoreCase, int toffset, String other,
int ooffset, int len);
So if we want to modify our algorithm to also find and replace input tags which are the same but are written using different letter case, all we have to modify is this one line:
if (src.regionMatches(true, ltIdx, key, 0, key.length())) {
Using this modified code, replaceable tags become case-insensitive:
Yo<H1>TITLE</H1><h3>Hi!</h3>Nice day.<H6>Hi back!</H6>End
Yo<big><big><big><b>TITLE</b></big></big></big><big>Hi!</big>Nice day.
<small>Hi back!</small>End
For performance - use StringBuilder.
For convenience you can use Map to store values and replacements.
Map<String, String> map = new HashMap<>();
map.put("<h1>","<big><big><big><b>");
map.put("</h1>","</b></big></big></big>");
map.put("<h2>","<big><big>");
...
StringBuilder builder = new StringBuilder(yourString);
for (String key : map.keySet()) {
replaceAll(builder, key, map.get(key));
}
... To replace all occurrences in a StringBuilder you can check here:
Replace all occurrences of a String using StringBuilder?
public static void replaceAll(StringBuilder builder, String from, String to)
{
int index = builder.indexOf(from);
while (index != -1)
{
builder.replace(index, index + from.length(), to);
index += to.length(); // Move to the end of the replacement
index = builder.indexOf(from, index);
}
}
Unfortunately StringBuilder doesn't provide a replace(string,string) method, so you might want to consider using Pattern and Matcher in conjunction with StringBuffer:
String input = ...;
StringBuffer sb = new StringBuffer();
Pattern p = Pattern.compile("</?(h1|h2|...)>");
Matcher m = p.matcher( input );
while( m.find() )
{
String match = m.group();
String replacement = ...; //get replacement for match, e.g. by lookup in a map
m.appendReplacement( sb, replacement );
}
m.appendTail( sb );
You could do something similar with StringBuilder but in that case you'd have to implement appendReplacement etc. yourself.
As for the expression you could also just try and match any html tag (although that might cause problems since regex and arbitrary html don't fit very well) and when the lookup doesn't have any result you just replace the match with itself.
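For reference, a complete version of the Pattern/Matcher idea might look like the sketch below. The class name, the heading-only pattern and the map lookup are my assumptions for illustration, not the original elided code. Unknown tags are replaced with themselves, as suggested above:

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RegexTagReplacer {
    // Assumed pattern: opening and closing h1..h6 tags, in any letter case.
    private static final Pattern HEADING_TAG =
            Pattern.compile("</?h[1-6]>", Pattern.CASE_INSENSITIVE);

    static String replaceTags(String input, Map<String, String> map) {
        StringBuffer sb = new StringBuffer();
        Matcher m = HEADING_TAG.matcher(input);
        while (m.find()) {
            String match = m.group();
            // Fall back to the matched text when the map has no entry for it.
            String replacement = map.getOrDefault(match.toLowerCase(Locale.ROOT), match);
            // quoteReplacement keeps '$' and '\' in the replacement literal.
            m.appendReplacement(sb, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> map = new HashMap<>();
        map.put("<h3>", "<big>");
        map.put("</h3>", "</big>");
        System.out.println(replaceTags("Yo<H3>Hi!</H3>End<h6>x</h6>", map));
        // prints: Yo<big>Hi!</big>End<h6>x</h6>
    }
}
```

Compiling the Pattern once into a static field avoids paying the compilation cost on every call.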
The particular example you provide seems to be HTML or XHTML. Trying to edit HTML or XML using regular expressions is fraught with problems. For the kind of editing you seem to be interested in doing, you should look at using XSLT. Another possibility is to use SAX, the streaming XML parser, and have your back-end write the edited output on the fly. If the text is actually HTML, you might be better off using a tolerant HTML parser, such as JSoup, to build a parsed representation of the document (like the DOM), and manipulate that before outputting it.
StringBuilder is backed by a char array. So, unlike String instances, it is mutable. Thus, you can call indexOf() and replace() on the StringBuilder.
I would do something like this
StringBuilder sb = new StringBuilder();
for (int i = 0; i < str.length(); i++) {
if (tagEquals(str, i, "h1")) {
sb.append("<big><big><big><b>");
i += 3; // skip "<h1", assuming i points at '<'; the loop's i++ then skips '>'
} else if (tagEquals(str, i, "/h1")) {
...
} else {
sb.append(str.charAt(i));
}
}
tagEquals is a func which checks a tag name
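tagEquals is not shown above, so here is one hypothetical way it could look, together with a cut-down loop that handles only h1. It assumes i points at the '<' character and that the tag has no attributes:

```java
final class TagScan {
    // Hypothetical helper: true if "<" + name + ">" starts at index i of str.
    static boolean tagEquals(String str, int i, String name) {
        int close = i + 1 + name.length(); // index where '>' is expected
        return close < str.length()
                && str.charAt(i) == '<'
                && str.regionMatches(i + 1, name, 0, name.length())
                && str.charAt(close) == '>';
    }

    static String convertH1(String str) {
        StringBuilder sb = new StringBuilder(str.length());
        for (int i = 0; i < str.length(); i++) {
            if (tagEquals(str, i, "h1")) {
                sb.append("<big><big><big><b>");
                i += 3; // skip "<h1"; the loop's i++ skips '>'
            } else if (tagEquals(str, i, "/h1")) {
                sb.append("</b></big></big></big>");
                i += 4; // skip "</h1"; the loop's i++ skips '>'
            } else {
                sb.append(str.charAt(i));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(convertH1("a<h1>T</h1>b"));
        // prints: a<big><big><big><b>T</b></big></big></big>b
    }
}
```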
Use Apache Commons StringUtils.replaceEach.
String[] searches = new String[]{"<h1>", "</h1>", "<h2>", ...};
String[] replacements = new String[]{"<big><big><big><b>", "</b></big></big></big>", "<big><big>", ...};
string = StringUtils.replaceEach(string, searches, replacements);

Unused variable in a loop

I need to build a pattern string according to an argument list. If the arguments are "foo", "bar", "data", then pattern should be: "?, ?, ?"
My code is:
List<String> args;
...
for(String s : args) {
pattern += "?,";
}
pattern = pattern.substring(0, pattern.length()-1);
It works fine, the only concern is, s is not used, it seems the code is a little dirty.
Any improvements for this?
I hope something like:
for(args.size()) {
...
}
But apparently there isn't..
You could use the classic for loop with a condition:
for (int i = 0; i < args.size(); i++)
In this case, i is being used as a counting variable.
Other than that, there aren't really any improvements to be made, nor is there much need for any.
I usually do that in Haskell / Python style - naming it with "_". That way it's sort of obvious that variable is intentionally unused:
int n = 0;
for (final Object _ : iterable) { ++n; }
IntelliJ still complains, though :)
Another option is to use the Java Stream API. It's pretty neat.
String output = args
.stream()
.map( string -> "?" ) // transform each string into a ?
.collect( Collectors.joining( "," ) ); // collect and join on ,
Why not use
for (int i = 0; i < args.size(); i++) {
...
}
You use the for each block if you want to make use of the contents of whatever you iterate on. For example, you use for (String s : args) if you know you're going to use each String value present in args. And it looks like here, you don't need the actual Strings.
If you have Guava around, you could try combining Joiner with Collections.nCopies:
Joiner.on(", ").join(Collections.nCopies(args.size(), "?"));
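If Guava is not available, the same idea should work with the standard library alone (Java 8 or later), since String.join accepts any Iterable of CharSequence. A small sketch, with a made-up class name:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class Placeholders {
    // Builds "?, ?, ?" with one "?" per argument; the argument values are never read.
    static String pattern(List<?> args) {
        return String.join(", ", Collections.nCopies(args.size(), "?"));
    }

    public static void main(String[] args) {
        System.out.println(pattern(Arrays.asList("foo", "bar", "data"))); // prints: ?, ?, ?
    }
}
```

An empty list simply yields an empty string, so no trailing-separator cleanup is needed.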
What you're looking for is a concept known as a "join". In stronger languages, like Groovy, it's available in the standard library, and you could write, for instance args.join(',') to get what you want. With Java, you can get a similar effect with StringUtils.join(args, ",") from Commons Lang (a library that should be included in every Java project everywhere).
Update: I obviously missed an important part with my original answer. The list of strings needs to be turned into question marks first. Commons Collections, another library that should always be included in Java apps, lets you do that with CollectionUtils.collect(args, new ConstantTransformer<String, String>("?")), which returns a new collection (CollectionUtils.transform, by contrast, modifies the collection in place and returns nothing). Then pass the result to the join that I originally mentioned. Of course, this is getting a bit unwieldy in Java, and a more imperative approach might be more appropriate.
For the sake of comparison, the entire thing can be solved in Groovy and many other languages with something like args.collect{'?'}.join(','). In Java, with the utilities I mentioned, that goes more like:
StringUtils.join(
CollectionUtils.collect(args,
new ConstantTransformer<String, String>("?")),
",");
Quite a bit less readable...
I would suggest using StringBuilder along with the classic for loop here.
String pattern = "";
if (args.size() > 0) {
StringBuilder sb = new StringBuilder("?");
for(int i = 1; i < args.size(); i++) {
sb.append(", ?");
}
pattern = sb.toString();
}
If you don't want to use a for loop (as you stated not concise enough) use a while instead:
int count;
String pattern = "";
if ((count = args.size()) > 0) {
StringBuilder sb = new StringBuilder("?");
while (count-- > 1) {
sb.append(", ?");
}
pattern = sb.toString();
}
Also, see When to use StringBuilder?
At the point where you're concatenating in a loop - that's usually when the compiler can't substitute StringBuilder by itself.

Should StringBuilders always be used for string manipulation? [duplicate]

If not, which of these pieces of code is better, and why?
public String backAround(String str) {
int len = str.length();
StringBuilder sb = new StringBuilder(str);
char chEnd = sb.charAt(len-1);
if(len > 0){
sb = sb.append(chEnd);
sb= sb.insert(0,chEnd);
str= sb.toString();
return str;
}else{ return str;}
}
or
public String backAround(String str) {
// Get the last char
String back = str.substring(str.length()-1, str.length());
return back + str + back;
}
If you are just "sticking a few elements together" as in your backAround() method, you may as well just use the + notation. The compiler will convert this into appropriate StringBuilder.append()s for you, so why bother 'spelling things out'.
The idea of explicitly using StringBuilder is that in principle you can hand-optimise how exactly the elements are appended to the string, including setting the initial buffer capacity and ensuring that you don't accidentally create intermediate String objects that are unnecessary in cases where the compiler might not predict these things.
So essentially, explicitly use a StringBuilder when there is slightly more complex logic to deciding what to append to the string. For example, if you are appending things in a loop, or where what is appended depends on various conditions at different points. Another case where you might use StringBuilder is if the string needs to be built up from various methods, for example: you can then pass the StringBuilder into the different methods and ask them to append the various elements.
P.S. I should say that StringBuilder buys you a little more editing power as well (e.g. among other things, you can set its length) and, given the presence of the Appendable interface, you can actually create more generic methods that either append to a StringBuilder or to e.g. a StringWriter. But these are marginal cases, I would submit.
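To make the Appendable remark concrete, here is a small sketch of a method that can write into either a StringBuilder or a StringWriter. The class and method names are invented for the example:

```java
import java.io.IOException;
import java.io.StringWriter;
import java.util.Arrays;
import java.util.List;

final class AppendableDemo {
    // Accepts any Appendable, e.g. StringBuilder, StringWriter or PrintStream.
    static void appendItems(Appendable out, List<String> items) throws IOException {
        for (int i = 0; i < items.size(); i++) {
            if (i > 0) {
                out.append(", ");
            }
            out.append(items.get(i));
        }
    }

    public static void main(String[] args) throws IOException {
        List<String> items = Arrays.asList("a", "b", "c");
        StringBuilder sb = new StringBuilder();
        appendItems(sb, items);
        StringWriter writer = new StringWriter();
        appendItems(writer, items);
        System.out.println(sb);     // prints: a, b, c
        System.out.println(writer); // prints: a, b, c
    }
}
```

The price of the generality is the checked IOException declared by Appendable.append, which callers have to handle even when the target is a StringBuilder.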
It really depends on what you are trying to do. In your case it seems like you're trying to take the last letter of a string and add it to both the front and the end. For this I would probably do this:
public String manipulate(String string)
{
char c = string.charAt(string.length() - 1);
return c + string + c;
}
In this case you didn't have to use a StringBuilder. There are cases where the StringBuilder class is useful. Here are some things that are hard to do with a String that StringBuilder can do:
delete chars at an index
insert chars at an index
get the index of a specific sequence
and much much more
if you want to see the documentation for StringBuilder:
String Builder
I hope this helped you out!
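A quick sketch of the operations listed above, using only methods that exist on StringBuilder (the class name BuilderEdits is made up):

```java
final class BuilderEdits {
    static String demo() {
        StringBuilder sb = new StringBuilder("Hello World");
        sb.deleteCharAt(4);                               // "Hell World"
        sb.insert(4, "o,");                               // "Hello, World"
        int idx = sb.indexOf("World");                    // 7
        sb.replace(idx, idx + "World".length(), "Java");  // "Hello, Java"
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints: Hello, Java
    }
}
```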

Using String or StringBuffer in Java: which is better?

I read a lot about using StringBuffer and String especially where concatenation is concerned in Java and whether one is thread safe or not.
So, in various Java methods, which should be used?
For example, in a PreparedStatement, should query be a StringBuffer:
String query = ("SELECT * " +
"FROM User " +
"WHERE userName = ?;");
try {
ps = connection.prepareStatement(query);
And then again, in a String utility methods like:
public static String prefixApostrophesWithBackslash(String stringIn) {
String stringOut = stringIn.replaceAll("'", "\\\\'");
return stringOut;
}
And:
// Removes a char from a String.
public static String removeChar(String stringIn, char c) {
String stringOut = ("");
for (int i = 0; i < stringIn.length(); i++) {
if (stringIn.charAt(i) != c) {
stringOut += stringIn.charAt(i);
}
}
return stringOut;
}
Should I be using StringBuffers? Especially since replaceAll is not available for such objects anyway.
Thanks
Mr Morgan.
Thanks for all the advice. StringBuffers have been replaced with StringBuilders and Strings replaced with StringBuilders where I've thought it best.
You almost never need to use StringBuffer.
Instead of StringBuffer you probably mean StringBuilder. A StringBuffer is like a StringBuilder except that it also offers thread safety. This thread safety is rarely needed in practice and will just cause your code to run more slowly.
Your question doesn't seem to be about String vs StringBuffer, but about using built-in methods or implementing the code yourself. If there is a built-in method that does exactly what you want, you should probably use it. The chances are it is much better optimized than the code you would write.
There is no simple answer (apart from repeating the mantra of StringBuilder versus StringBuffer ... ). You really have to understand a fair bit about what goes on "under the hood" in order to pick the most efficient solution.
In your first example, String is the way to go. The Java compiler can generate pretty much optimal code (using a StringBuilder if necessary) for any expression consisting of a sequence of String concatenations. And, if the strings that are concatenated are all constants or literals, the compiler can actually do the concatenation at compile time.
In your second example, it is not entirely clear whether String or StringBuilder would be better ... or whether they would be roughly equivalent. One would need to look at the code of the java.util.regex.Matcher class to figure this out.
EDIT - I looked at the code, and actually it makes little difference whether you use a String or StringBuilder as the source. Internally the Matcher.replaceAll method creates a new StringBuilder and fills it by appending chunks from the source String and the replacement String.
In your third example, a StringBuilder would clearly be best. A current generation Java compiler is not able to optimize the code (as written) to avoid creating a new String as each character is added.
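A StringBuilder version of the third example might look like this sketch (same behaviour as the original removeChar; the wrapper class name is made up):

```java
final class CharRemover {
    // Removes every occurrence of c, building the result in one buffer
    // instead of creating a new String per appended character.
    static String removeChar(String stringIn, char c) {
        StringBuilder out = new StringBuilder(stringIn.length());
        for (int i = 0; i < stringIn.length(); i++) {
            char current = stringIn.charAt(i);
            if (current != c) {
                out.append(current);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(removeChar("banana", 'a')); // prints: bnn
    }
}
```

The initial capacity equals the input length, which is an upper bound for the result, so the buffer never has to grow.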
For the below segment of code
// Removes a char from a String.
public static String removeChar(String stringIn, char c) {
String stringOut = ("");
for (int i = 0; i < stringIn.length(); i++) {
if (stringIn.charAt(i) != c) {
stringOut += stringIn.charAt(i);
}
}
return stringOut;
}
You could just do stringIn.replace(String.valueOf(c), ""), which, unlike replaceAll, does not treat c as a regular expression.
Even in MT code, it's unusual to have multiple threads append stuff to a string. StringBuilder is almost always preferred to StringBuffer.
Modern compilers optimize the code already. So some String additions will be optimized to use StringBuilder, and we can keep the String additions if we think it increases readability.
Example 1:
String query = ("SELECT * " +
"FROM User " +
"WHERE userName = ?;");
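will be optimized. Since every operand here is a literal, the compiler can even fold the whole expression into a single constant; for non-constant operands it generates something like:
StringBuilder sb = new StringBuilder();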
sb.append("SELECT * ");
sb.append("FROM User ");
sb.append("WHERE userName = ?;");
String query = sb.toString();
Example 2:
String numbers = "";
for (int i = 0;i < 20; i++)
numbers = numbers + i;
This can't be optimized and we should use a StringBuilder in code.
I made this observation for the Sun JDK 1.5+. For older Java versions or different JDKs it can be different. There it could be safe to always code with StringBuilder (or StringBuffer for JDK 1.4.2 and older).
For cases which can be considered single threaded, the best would be StringBuilder. It does not add any synchronization overhead, while StringBuffer does.
String concatenation with the '+' operator is "good" only when you're too lazy to use StringBuilder, or just want to keep the code easily readable and the cost is acceptable from a performance point of view, as in a startup log message: LOG.info("Starting instance " + inst_id + " of " + app_name);

What's a good way of building up a String given specific start and end locations?

(java 1.5)
I have a need to build up a String, in pieces. I'm given a set of (sub)strings, each with a start and end point of where they belong in the final string. Was wondering if there were some canonical way of doing this. This isn't homework, and I can use any licensable OSS, such as jakarta commons-lang StringUtils etc.
My company has a solution using a CharBuffer, and I'm content to leave it as is (and add some unit tests, of which there are none (?!)) but the code is fairly hideous and I would like something easier to read.
As I said this isn't homework, and I don't need a complete solution, just some pointers to libraries or java classes that might give me some insight. The String.Format didn't seem QUITE right...
I would have to honor inputs too long and too short, etc. Substrings would be overlaid in the order they appear (in case of overlap).
As an example of input, I might have something like:
String:start:end
FO:0:3 (string shorter than field)
BAR:4:5 (String larger than field)
BLEH:5:9 (String overlays previous field)
I'd want to end up with
FO  BBLEH
01234567890
(Edit: To all - StringBuilder (and specifically, the "pre-allocate to a known length, then use .replace()" theme) seems to be what I'm thinking of. Thanks to all who suggested it!)
StringBuilder output = new StringBuilder();
// for each input element
{
while (output.length() < start)
{
output.append(' ');
}
output.replace(start, end, string);
}
You could also establish the final size of output before inserting any string into it. You could make a first pass through the input elements to find the largest end. This will be the final size of output.
char[] spaces = new char[size];
Arrays.fill(spaces, ' ');
output.append(spaces);
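Putting the two ideas together (find the largest end first, pre-fill with spaces, then overlay), a self-contained sketch could look like this. The Piece class and all names are hypothetical, end is treated as exclusive, and a short piece does not blank out the rest of its field; both are assumptions about the requirements:

```java
import java.util.Arrays;

final class FieldOverlay {
    // One input piece: text belongs at [start, end) of the result.
    static final class Piece {
        final String text;
        final int start;
        final int end;

        Piece(String text, int start, int end) {
            this.text = text;
            this.start = start;
            this.end = end;
        }
    }

    static String assemble(Piece... pieces) {
        // First pass: the largest end is the final size of the output.
        int size = 0;
        for (Piece p : pieces) {
            size = Math.max(size, p.end);
        }
        char[] out = new char[size];
        Arrays.fill(out, ' ');
        // Second pass: overlay in input order, truncating text that is too long.
        for (Piece p : pieces) {
            int width = Math.min(p.text.length(), p.end - p.start);
            p.text.getChars(0, width, out, p.start);
        }
        return new String(out);
    }

    public static void main(String[] args) {
        String result = assemble(
                new Piece("FO", 0, 3), new Piece("BAR", 4, 5), new Piece("BLEH", 5, 9));
        System.out.println("[" + result + "]"); // prints: [FO  BBLEH]
    }
}
```

Working on a char[] keeps the output length fixed, so a replacement of a different length can never shift the later fields.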
Will StringBuilder do?
StringBuilder sb = new StringBuilder();
sb.setLength(20);
sb.replace(0, 3, "FO");
sb.replace(4, 5, "BAR");
sb.replace(5, 9, "BLEH");
System.out.println("[" + sb.toString().replace('\0', ' ') + "]");
// prints "[FO  BBLEH            ]"
If I understand your requirements correctly, you should be able to do this with the standard java.lang.StringBuilder:
public class StringAssembler
{
private final StringBuilder builder = new StringBuilder();
public void addPiece(String input, int start, int end)
{
final String actualInput = input.substring(0, end-start+1);
builder.insert(start, actualInput);
}
public String getFullString()
{
return builder.toString();
}
}
In particular, I don't think that the end parameter is strictly necessary, in that all it can do is change the length of the input string, hence the two steps in my addPiece method.
Note that this is not tested, and probably doesn't do the right thing in edge cases, but it should give you something to start from.
You can use StringUtils.rightPad(str, size) to add the necessary number of spaces. And you can use the following to strip the unneeded characters:
if (str.length() > size) {
str = str.substring(0, size);
}
