Should StringBuilders always be used for string manipulation? [duplicate] - java

Closed 10 years ago.
Possible Duplicate:
when to use StringBuilder in java
If not, which of these pieces of code is better, and why?
public String backAround(String str) {
    int len = str.length();
    if (len == 0) {
        return str; // nothing to wrap; also avoids charAt(-1) on an empty string
    }
    StringBuilder sb = new StringBuilder(str);
    char chEnd = sb.charAt(len - 1);
    sb.append(chEnd);
    sb.insert(0, chEnd);
    return sb.toString();
}
or
public String backAround(String str) {
// Get the last char
String back = str.substring(str.length()-1, str.length());
return back + str + back;
}

If you are just "sticking a few elements together" as in your backAround() method, you may as well just use the + notation. The compiler will convert this into appropriate StringBuilder.append()s for you, so why bother 'spelling things out'?
The idea of explicitly using StringBuilder is that in principle you can hand-optimise how exactly the elements are appended to the string, including setting the initial buffer capacity and ensuring that you don't accidentally create intermediate String objects that are unnecessary in cases where the compiler might not predict these things.
So essentially, explicitly use a StringBuilder when there is slightly more complex logic to deciding what to append to the string. For example, if you are appending things in a loop, or where what is appended depends on various conditions at different points. Another case where you might use StringBuilder is if the string needs to be built up from various methods, for example: you can then pass the StringBuilder into the different methods and ask them to append the various elements.
P.S. I should say that StringBuilder buys you a little more editing power as well (e.g. among other things, you can set its length) and, given the presence of the Appendable interface, you can actually create more generic methods that either append to a StringBuilder or to e.g. a StringWriter. But these are marginal cases, I would submit.
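To make the loop, condition and pass-the-builder cases above a bit more concrete, here is a minimal sketch. The class and method names (CsvLine, appendField, join) and the initial capacity of 64 are made up for illustration; they do not come from the question or from any library.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

class CsvLine {
    // Hypothetical helper: appends one field to any Appendable
    // (a StringBuilder, a StringWriter, ...), quoting it if it contains a comma.
    static void appendField(Appendable out, String field) throws IOException {
        if (field.indexOf(',') >= 0) {
            out.append('"').append(field).append('"');
        } else {
            out.append(field);
        }
    }

    // Builds the whole line in a single StringBuilder inside a loop.
    static String join(List<String> fields) {
        StringBuilder sb = new StringBuilder(64); // assumed initial capacity
        try {
            for (int i = 0; i < fields.size(); i++) {
                if (i > 0) {
                    sb.append(',');
                }
                appendField(sb, fields.get(i));
            }
        } catch (IOException e) {
            // StringBuilder never throws IOException; the Appendable interface merely declares it.
            throw new AssertionError(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(join(Arrays.asList("a", "b,c", "d"))); // prints a,"b,c",d
    }
}
```

Because appendField takes an Appendable rather than a StringBuilder, the same helper could also write to a StringWriter, which is the marginal case the P.S. mentions.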

It really depends on what you are trying to do. In your case it seems like you're trying to take the last letter of a string and add it to both the front and the end. For this I would probably do this:
public String manipulate(String string)
{
    char c = string.charAt(string.length() - 1); // last char; assumes a non-empty string
    return c + string + c;
}
In this case you didn't have to use a StringBuilder. There are cases where the StringBuilder class is useful. Here are some things that are hard to do with a String that StringBuilder can do:
delete chars at an index
insert chars at an index
get the index of a specific sequence
and much, much more
For the full list, see the StringBuilder documentation.
I hope this helped you out!
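As a rough illustration of the operations listed above, here is a small sketch that edits one StringBuilder in place. The class name BuilderEdits and the sample text are just for this example.

```java
class BuilderEdits {
    static String demo() {
        StringBuilder sb = new StringBuilder("hello world");
        sb.deleteCharAt(0);                 // delete a char at an index: "ello world"
        sb.insert(0, 'J');                  // insert a char at an index: "Jello world"
        int idx = sb.indexOf("world");      // index of a specific sequence: 6
        sb.replace(idx, idx + 5, "there");  // replace a range: "Jello there"
        sb.setLength(sb.length() - 1);      // truncate by one char: "Jello ther"
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints Jello ther
    }
}
```

Each call mutates the same buffer, whereas the equivalent String code would create a new String at every step.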

Related

Alternative to successive String.replace

I want to replace some strings in a String input :
string=string.replace("<h1>","<big><big><big><b>");
string=string.replace("</h1>","</b></big></big></big>");
string=string.replace("<h2>","<big><big>");
string=string.replace("</h2>","</big></big>");
string=string.replace("<h3>","<big>");
string=string.replace("</h3>","</big>");
string=string.replace("<h4>","<b>");
string=string.replace("</h4>","</b>");
string=string.replace("<h5>","<small><b>");
string=string.replace("</h5>","</b></small>");
string=string.replace("<h6>","<small>");
string=string.replace("</h6>","</small>");
As you can see this approach is not the best, because each time I have to search for the portion to replace etc, and Strings are immutable... Also the input is large, which means that some performance issues are to be considered.
Is there a better approach that reduces the complexity of this code?
Although StringBuilder.replace() is a huge improvement compared to String.replace(), it is still very far from being optimal.
The problem with StringBuilder.replace() is that if the replacement has different length than the replaceable part (applies to our case), a bigger internal char array might have to be allocated, and the content has to be copied, and then the replace will occur (which also involves copying).
Imagine this: you have a text with 10,000 characters. If you want to replace the "XY" substring found at position 1 (the 2nd character) with "ABC", the implementation may have to reallocate a char buffer that is larger by at least 1, copy the old content to the new array, shift 9,997 characters (starting at position 3) to the right by 1 to fit "ABC" into the place of "XY", and finally copy the characters of "ABC" to the starting position 1. This has to be done for every replace! This is slow.
Faster Solution: Building Output On-The-Fly
We can build the output on-the-fly: parts that don't contain replaceable texts can simply be appended to the output, and if we find a replaceable fragment, we append the replacement instead of it. Theoretically it's enough to loop over the input only once to generate the output. Sounds simple, and it's not that hard to implement it.
Implementation:
We will use a Map preloaded with mappings of the replaceable-replacement strings:
Map<String, String> map = new HashMap<>();
map.put("<h1>", "<big><big><big><b>");
map.put("</h1>", "</b></big></big></big>");
map.put("<h2>", "<big><big>");
map.put("</h2>", "</big></big>");
map.put("<h3>", "<big>");
map.put("</h3>", "</big>");
map.put("<h4>", "<b>");
map.put("</h4>", "</b>");
map.put("<h5>", "<small><b>");
map.put("</h5>", "</b></small>");
map.put("<h6>", "<small>");
map.put("</h6>", "</small>");
And using this, here is the replacer code: (more explanation after the code)
public static String replaceTags(String src, Map<String, String> map) {
StringBuilder sb = new StringBuilder(src.length() + src.length() / 2);
for (int pos = 0;;) {
int ltIdx = src.indexOf('<', pos);
if (ltIdx < 0) {
// No more '<', we're done:
sb.append(src, pos, src.length());
return sb.toString();
}
sb.append(src, pos, ltIdx); // Copy chars before '<'
// Check if our hit is replaceable:
boolean mismatch = true;
for (Entry<String, String> e : map.entrySet()) {
String key = e.getKey();
if (src.regionMatches(ltIdx, key, 0, key.length())) {
// Match, append the replacement:
sb.append(e.getValue());
pos = ltIdx + key.length();
mismatch = false;
break;
}
}
if (mismatch) {
sb.append('<');
pos = ltIdx + 1;
}
}
}
Testing it:
String in = "Yo<h1>TITLE</h1><h3>Hi!</h3>Nice day.<h6>Hi back!</h6>End";
System.out.println(in);
System.out.println(replaceTags(in, map));
Output: (wrapped to avoid scroll bar)
Yo<h1>TITLE</h1><h3>Hi!</h3>Nice day.<h6>Hi back!</h6>End
Yo<big><big><big><b>TITLE</b></big></big></big><big>Hi!</big>Nice day.
<small>Hi back!</small>End
This solution is faster than using regular expressions, as that involves much overhead (compiling a Pattern, creating a Matcher, etc.) and regexp is also much more general. It also creates many temporary objects under the hood which are thrown away after the replace. Here I only use a StringBuilder (plus the char array under its hood) and the code iterates over the input String only once. This solution is also much faster than using StringBuilder.replace(), as detailed at the top of this answer.
Notes and Explanation
I initialized the StringBuilder in the replaceTags() method like this:
StringBuilder sb = new StringBuilder(src.length() + src.length() / 2);
So basically I created it with an initial capacity of 150% of the length of the original String. This is because our replacements are longer than the replaceable texts, so if replacing occurs, the output will obviously be longer than the input. Giving a larger initial capacity to StringBuilder means that in most cases no internal char[] reallocation is needed at all (of course the required initial capacity depends on the replaceable-replacement pairs and their frequency/occurrence in the input, but this +50% is a good upper estimate).
I also utilized the fact that all replaceable strings start with a '<' character, so finding the next potential replaceable position becomes blazing-fast:
int ltIdx = src.indexOf('<', pos);
It's just a simple loop and char comparisons inside String, and since it always starts searching from pos (and not from the start of the input), overall the code iterates over the input String only once.
And finally, to tell if a replaceable String does occur at the potential position, we use the String.regionMatches() method to check the replaceable strings, which is also blazing-fast, as all it does is compare char values in a loop and return at the very first mismatching character.
And a PLUS:
The question doesn't mention it, but our input is an HTML document. HTML tags are case-insensitive which means the input might contain <H1> instead of <h1>.
To this algorithm this is not a problem. The regionMatches() in the String class has an overload which supports case-insensitive comparison:
boolean regionMatches(boolean ignoreCase, int toffset, String other,
int ooffset, int len);
So if we want to modify our algorithm to also find and replace input tags which are the same but are written using different letter case, all we have to modify is this one line:
if (src.regionMatches(true, ltIdx, key, 0, key.length())) {
Using this modified code, replaceable tags become case-insensitive:
Yo<H1>TITLE</H1><h3>Hi!</h3>Nice day.<H6>Hi back!</H6>End
Yo<big><big><big><b>TITLE</b></big></big></big><big>Hi!</big>Nice day.
<small>Hi back!</small>End
For performance - use StringBuilder.
For convenience you can use Map to store values and replacements.
Map<String, String> map = new HashMap<>();
map.put("<h1>","<big><big><big><b>");
map.put("</h1>","</b></big></big></big>");
map.put("<h2>","<big><big>");
...
StringBuilder builder = new StringBuilder(yourString);
for (String key : map.keySet()) {
replaceAll(builder, key, map.get(key));
}
... To replace all occurrences in a StringBuilder you can check here:
Replace all occurrences of a String using StringBuilder?
public static void replaceAll(StringBuilder builder, String from, String to)
{
int index = builder.indexOf(from);
while (index != -1)
{
builder.replace(index, index + from.length(), to);
index += to.length(); // Move to the end of the replacement
index = builder.indexOf(from, index);
}
}
Unfortunately StringBuilder doesn't provide a replace(string,string) method, so you might want to consider using Pattern and Matcher in conjunction with StringBuffer:
String input = ...;
StringBuffer sb = new StringBuffer();
Pattern p = Pattern.compile("</?(h1|h2|...)>");
Matcher m = p.matcher( input );
while( m.find() )
{
String match = m.group();
String replacement = ...; //get replacement for match, e.g. by lookup in a map
m.appendReplacement( sb, replacement );
}
m.appendTail( sb );
You could do something similar with StringBuilder but in that case you'd have to implement appendReplacement etc. yourself.
As for the expression you could also just try and match any html tag (although that might cause problems since regex and arbitrary html don't fit very well) and when the lookup doesn't have any result you just replace the match with itself.
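For reference, here is one way the Pattern/Matcher outline above could be filled in. This is only a sketch: the class name RegexTagReplacer, the pattern `</?h[1-6]>`, and the reduced replacement map (only h1 and h3) are assumptions for the example, not what the answer's elided parts actually contained. Unmapped tags are replaced with themselves, as suggested above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexTagReplacer {
    // Assumed subset of the question's mappings, enough for a demo.
    private static final Map<String, String> REPLACEMENTS = new HashMap<>();
    static {
        REPLACEMENTS.put("<h1>", "<big><big><big><b>");
        REPLACEMENTS.put("</h1>", "</b></big></big></big>");
        REPLACEMENTS.put("<h3>", "<big>");
        REPLACEMENTS.put("</h3>", "</big>");
    }

    // Assumed pattern: opening or closing h1..h6 tags, lower case, no attributes.
    private static final Pattern TAG = Pattern.compile("</?h[1-6]>");

    static String replaceTags(String input) {
        StringBuffer sb = new StringBuffer(input.length());
        Matcher m = TAG.matcher(input);
        while (m.find()) {
            String match = m.group();
            String replacement = REPLACEMENTS.get(match);
            if (replacement == null) {
                replacement = match; // no mapping: keep the tag unchanged
            }
            // quoteReplacement guards against '$' and '\' being treated as group references.
            m.appendReplacement(sb, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(replaceTags("Yo<h1>TITLE</h1><h2>kept</h2>"));
    }
}
```

StringBuffer is used here because Matcher.appendReplacement has accepted a StringBuffer in every Java version; the StringBuilder overload only exists in newer releases.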
The particular example you provide seems to be HTML or XHTML. Trying to edit HTML or XML using regular expressions is fraught with problems. For the kind of editing you seem to be interested in doing you should look at using XSLT. Another possibility is to use SAX, the streaming XML parser, and have your back-end write the edited output on the fly. If the text is actually HTML, you might be better off using a tolerant HTML parser, such as JSoup, to build a parsed representation of the document (like the DOM), and manipulate that before outputting it.
StringBuilder is backed by a char array. So, unlike String instances, it is mutable. Thus, you can call indexOf() and replace() on the StringBuilder.
I would do something like this
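StringBuilder sb = new StringBuilder();
for (int i = 0; i < str.length(); i++) {
    if (tagEquals(str, i, "h1")) {
        sb.append("<big><big><big><b>");
        i += 3; // assuming i points at '<': with the loop's i++ this skips all 4 chars of "<h1>"
    } else if (tagEquals(str, i, "/h1")) {
        ...
    } else {
        sb.append(str.charAt(i));
    }
}
tagEquals is a helper function (not shown) that checks whether the tag with the given name starts at index i.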
Use Apache Commons StringUtils.replaceEach.
String[] searches = new String[]{"<h1>", "</h1>", "<h2>", ...};
String[] replacements = new String[]{"<big><big><big><b>", "</b></big></big></big>", "<big><big>", ...};
string = StringUtils.replaceEach(string, searches, replacements);

Difference between String, String Builders, Character Arrays and Arraylist [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
Strings are immutable. StringBuilders are not, so you can append characters at the end. Strings are character arrays, if I am not wrong, so why do we use character arrays and Strings separately? Do we really need to use character arrays?
Secondly, there are character arrays and then there are ArrayLists. Do ArrayLists hold complete objects? I am a bit confused, actually.
String cat = "c" + "a" + "t";
cat = cat + cat;
StringBuilder sb = new StringBuilder();
sb.append(city);
sb.append(", ");
sb.append(state);
sb.toString();
char[] apple = {'a','p','p','l','e'};
ArrayList<MyCar> obj = new ArrayList<MyCar>();
Which should be used where?
The difference between String and StringBuilder is explained best by the following answer, quoted from "Correct way to use StringBuilder":
Note that the aim (usually) is to reduce memory churn rather than total memory used, to make life a bit easier on the garbage collector.
Will that take memory equal to using String like below?
No, it'll cause more memory churn than just the straight concat you quoted. (Until/unless the JVM optimizer sees that the explicit StringBuilder in the code is unnecessary and optimizes it out, if it can.)
If the author of that code wants to use StringBuilder (there are arguments for, but also against; see note at the end of this answer), better to do it properly (here I'm assuming there aren't actually quotes around id2 and table):
StringBuilder sb = new StringBuilder(some_appropriate_size);
sb.append("select id1, ");
sb.append(id2);
sb.append(" from ");
sb.append(table);
return sb.toString();
Note that I've listed some_appropriate_size in the StringBuilder constructor, so that it starts out with enough capacity for the full content we're going to append. The default size used if you don't specify one is 16 characters, which is usually too small and results in the StringBuilder having to do reallocations to make itself bigger (IIRC, in the Sun/Oracle JDK, it doubles itself [or more, if it knows it needs more to satisfy a specific append] each time it runs out of room).
You may have heard that string concatenation will use a StringBuilder under the covers if compiled with the Sun/Oracle compiler. This is true, it will use one StringBuilder for the overall expression. But it will use the default constructor, which means in the majority of cases, it will have to do a reallocation. It's easier to read, though. Note that this is not true of a series of concatenations. So for instance, this uses one StringBuilder:
return "prefix " + variable1 + " middle " + variable2 + " end";
It roughly translates to:
StringBuilder tmp = new StringBuilder(); // Using default 16 character size
tmp.append("prefix ");
tmp.append(variable1);
tmp.append(" middle ");
tmp.append(variable2);
tmp.append(" end");
return tmp.toString();
So that's okay, although the default constructor and subsequent reallocation(s) isn't ideal, the odds are it's good enough — and the concatenation is a lot more readable.
But that's only for a single expression. Multiple StringBuilders are used for this:
String s;
s = "prefix ";
s += variable1;
s += " middle ";
s += variable2;
s += " end";
return s;
That ends up becoming something like this:
String s;
StringBuilder tmp;
s = "prefix ";
tmp = new StringBuilder();
tmp.append(s);
tmp.append(variable1);
s = tmp.toString();
tmp = new StringBuilder();
tmp.append(s);
tmp.append(" middle ");
s = tmp.toString();
tmp = new StringBuilder();
tmp.append(s);
tmp.append(variable2);
s = tmp.toString();
tmp = new StringBuilder();
tmp.append(s);
tmp.append(" end");
s = tmp.toString();
return s;
...which is pretty ugly.
It's important to remember, though, that in all but a very few cases it doesn't matter and going with readability (which enhances maintainability) is preferred barring a specific performance issue.
Normally a String is used for ordinary string-based requirements, whenever a plain String suffices.
StringBuilder is used whenever you want to manipulate and play with the string.
A character array is used when you want to easily iterate over each and every character.
ArrayList is a collection. Use it for holding objects of a particular type.
String is immutable object that includes underlying char array.
In your line 2 you discard your String that you created in line 1 and create a new String.
String builder avoids creating new String objects for every separate substring.
Both arrays and ArrayLists can contain objects; the main difference is that an ArrayList can grow and an array cannot. The second difference is that an ArrayList is really a List...
A String uses a char[]. A String is not a char[], in the same way that an ArrayList<String> is not a String[].
ArrayList is a dynamic data structure, meaning it can grow as needed. An array is static, meaning its length does not change over its lifetime.
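To tie the four types together, here is a small sketch of how each behaves when you try to change it. The class name TextTypesDemo and the sample words are invented for this example.

```java
import java.util.ArrayList;
import java.util.List;

class TextTypesDemo {
    static String demo() {
        String s = "cat";
        s.concat("s");                    // result discarded: s is still "cat" (Strings are immutable)
        String plural = s.concat("s");    // a new String object: "cats"

        StringBuilder sb = new StringBuilder(s);
        sb.append('s');                   // mutates the builder in place: "cats"

        char[] chars = s.toCharArray();   // a copy of the characters, fixed length 3
        chars[0] = 'b';                   // elements are mutable: "bat"; s is unaffected

        List<String> words = new ArrayList<>(); // holds whole objects and grows on demand
        words.add(s);
        words.add(plural);
        words.add(sb.toString());
        words.add(new String(chars));
        return String.join(",", words);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints cat,cats,cats,bat
    }
}
```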

Java - Generating strings of length x

I have some 'heavy' string manipulation in my Java program, which often involves iterating through a String and replacing certain segments with filler characters, usually "#". These characters are later removed, but are used so that the length of the String and the current index are kept intact during the iteration.
This process usually involves replacing more than 1 character at a time.
e.g.
I might need to replace "cat" with "###" in the string "I love cats", giving "I love ###s",
So often I need to create strings of "#" with x length.
In python, this is easy.
NewString = "#" *x
In Java, I find my current method revolting.
String NewString = "";
for (int i=0; i< x; i++) {
NewString = NewString.concat("#"); }
Is there a proper, pre-established method for doing this?
Does anybody have a shorter, more 'golfed' method?
Thanks!
Specs:
Java SE (Jre7)
Windows 7 (32)
It's not clear to me what kind of regex the comments are suggesting, but creating a string filled with a particular character to the given length is pretty easy:
public static String createString(char character, int length) {
char[] chars = new char[length];
Arrays.fill(chars, character);
return new String(chars);
}
Guava has a nice little method Strings.repeat(String, int). Looking at the source of that method, it basically amounts to this:
StringBuilder builder = new StringBuilder(string.length() * count);
for (int i = 0; i < count; i++) {
builder.append(string);
}
return builder.toString();
Your way of building a string of length N is very inefficient. You should either use StringBuffer with its convenient append method, or build an array of N characters, and use the corresponding constructor of the String.
Can you always use the same characters in the "filler" String and do you know the maximum value of x? Then you can create a constant upfront which can be cut to arbitrary length:
private static final String FILLER = "##############################################";
// inside your method
String newString = FILLER.substring(0, x);
java.lang.String is immutable. So, concatenating strings results in the creation of temporary String objects and is thus slow. You should consider using a mutable buffer like StringBuffer or StringBuilder. Another best practice when working with strings in Java is to prefer the CharSequence type wherever possible. This avoids unnecessary calls to toString() and lets you easily change the underlying implementation type.
If you are looking for a one liner to repeat strings and this justifies using an external library, have a look at StringUtils.repeat from Apache Commons library. But, I feel you can just write your own code than using another library for a trivial task of repeating strings.
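Putting the suggestions above together for the use case in the question, a sketch might look like the following. The class name Filler and the helper names repeat and mask are made up here; mask overwrites characters in a StringBuilder so the length and all indices stay the same.

```java
import java.util.Arrays;

class Filler {
    // Builds a String of 'count' copies of c using a pre-sized char array.
    static String repeat(char c, int count) {
        char[] chars = new char[count];
        Arrays.fill(chars, c);
        return new String(chars);
    }

    // Overwrites every occurrence of target with the filler character, in place.
    static String mask(String text, String target, char filler) {
        if (target.isEmpty()) {
            return text; // nothing to mask; also avoids an endless loop on indexOf("")
        }
        StringBuilder sb = new StringBuilder(text);
        int idx = sb.indexOf(target);
        while (idx >= 0) {
            for (int i = idx; i < idx + target.length(); i++) {
                sb.setCharAt(i, filler);
            }
            idx = sb.indexOf(target, idx + target.length());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(repeat('#', 3));                // prints ###
        System.out.println(mask("I love cats", "cat", '#')); // prints I love ###s
    }
}
```

With mask, there is no need to build the "###" string at all, since the characters are overwritten one by one.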

Using String or StringBuffer in Java: which is better?

I read a lot about using StringBuffer and String especially where concatenation is concerned in Java and whether one is thread safe or not.
So, in various Java methods, which should be used?
For example, in a PreparedStatement, should query be a StringBuffer:
String query = ("SELECT * " +
"FROM User " +
"WHERE userName = ?;");
try {
ps = connection.prepareStatement(query);
And then again, in a String utility methods like:
public static String prefixApostrophesWithBackslash(String stringIn) {
String stringOut = stringIn.replaceAll("'", "\\\\'");
return stringOut;
}
And:
// Removes a char from a String.
public static String removeChar(String stringIn, char c) {
String stringOut = ("");
for (int i = 0; i < stringIn.length(); i++) {
if (stringIn.charAt(i) != c) {
stringOut += stringIn.charAt(i);
}
}
return stringOut;
}
Should I be using StringBuffers? Especially where replaceAll is not available for such objects anyway.
Thanks
Mr Morgan.
Thanks for all the advice. StringBuffers have been replaced with StringBuilders and Strings replaced with StringBuilders where I've thought it best.
You almost never need to use StringBuffer.
Instead of StringBuffer you probably mean StringBuilder. A StringBuffer is like a StringBuilder except that it also offers thread safety. This thread safety is rarely needed in practice and will just cause your code to run more slowly.
Your question doesn't seem to be about String vs StringBuffer, but about using built-in methods or implementing the code yourself. If there is a built-in method that does exactly what you want, you should probably use it. The chances are it is much better optimized than the code you would write.
There is no simple answer (apart from repeating the mantra of StringBuilder versus StringBuffer ... ). You really have to understand a fair bit about what goes on "under the hood" in order to pick the most efficient solution.
In your first example, String is the way to go. The Java compiler can generate pretty much optimal code (using a StringBuilder if necessary) for any expression consisting of a sequence of String concatenations. And, if the strings that are concatenated are all constants or literals, the compiler can actually do the concatenation at compile time.
In your second example, it is not entirely clear whether String or StringBuilder would be better ... or whether they would be roughly equivalent. One would need to look at the code of the java.util.regex.Matcher class to figure this out.
EDIT - I looked at the code, and actually it makes little difference whether you use a String or StringBuilder as the source. Internally the Matcher.replaceAll method creates a new StringBuilder and fills it by appending chunks from the source String and the replacement String.
In your third example, a StringBuilder would clearly be best. A current generation Java compiler is not able to optimize the code (as written) to avoid creating a new String as each character is added.
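For the third example, a StringBuilder version could look roughly like this. The method keeps the question's name and signature; the wrapper class CharRemover is just scaffolding for the sketch.

```java
class CharRemover {
    // Removes every occurrence of c from stringIn using a single buffer.
    static String removeChar(String stringIn, char c) {
        // The result can never be longer than the input, so this capacity is always enough.
        StringBuilder out = new StringBuilder(stringIn.length());
        for (int i = 0; i < stringIn.length(); i++) {
            char ch = stringIn.charAt(i);
            if (ch != c) {
                out.append(ch);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(removeChar("banana", 'a')); // prints bnn
    }
}
```

Only one String is created, at the end, instead of one per kept character.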
For the below segment of code
// Removes a char from a String.
public static String removeChar(String stringIn, char c) {
String stringOut = ("");
for (int i = 0; i < stringIn.length(); i++) {
if (stringIn.charAt(i) != c) {
stringOut += stringIn.charAt(i);
}
}
return stringOut;
}
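You could just do stringIn.replace(String.valueOf(c), ""). (stringIn.replaceAll(c+"","") also works for most characters, but replaceAll treats its first argument as a regular expression, so it misbehaves when c is a metacharacter such as '.' or '*'.)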
Even in MT code, it's unusual to have multiple threads append stuff to a string. StringBuilder is almost always preferred to StringBuffer.
Modern compilers optimize the code already. So some String additions will be optimized to use StringBuilder, and we can keep the String additions if we think it increases readability.
Example 1:
String query = ("SELECT * " +
"FROM User " +
"WHERE userName = ?;");
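will be optimized to something like this (when every operand is a compile-time constant, as here, the compiler can even fold the whole expression into a single literal):
StringBuilder sb = new StringBuilder();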
sb.append("SELECT * ");
sb.append("FROM User ");
sb.append("WHERE userName = ?;");
String query = sb.toString();
Example 2:
String numbers = "";
for (int i = 0;i < 20; i++)
numbers = numbers + i;
This can't be optimized and we should use a StringBuilder in code.
I made this observation for Sun JDK 1.5+. For older Java versions or different JDKs it can be different. There it could be safe to always code with StringBuilder (or StringBuffer for JDK 1.4.2 and older).
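For completeness, Example 2 rewritten with an explicit StringBuilder might look like this. The class and method names (NumberJoiner, numbers) are invented for the sketch.

```java
class NumberJoiner {
    // Concatenates 0, 1, ..., count-1 using one buffer for the whole loop.
    static String numbers(int count) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count; i++) {
            sb.append(i); // append(int) writes the decimal digits
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(numbers(5)); // prints 01234
    }
}
```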
For cases which can be considered single threaded, the best would be StringBuilder. It does not add any synchronization overhead, while StringBuffer does.
String concatenation with the '+' operator is "good" only when you're too lazy to use StringBuilder, or just want to keep the code easily readable and the performance is acceptable, as in a startup log message such as LOG.info("Starting instance " + inst_id + " of " + app_name);

What's a good way of building up a String given specific start and end locations?

(java 1.5)
I have a need to build up a String, in pieces. I'm given a set of (sub)strings, each with a start and end point of where they belong in the final string. Was wondering if there were some canonical way of doing this. This isn't homework, and I can use any licensable OSS, such as jakarta commons-lang StringUtils etc.
My company has a solution using a CharBuffer, and I'm content to leave it as is (and add some unit tests, of which there are none (?!)) but the code is fairly hideous and I would like something easier to read.
As I said this isn't homework, and I don't need a complete solution, just some pointers to libraries or java classes that might give me some insight. The String.Format didn't seem QUITE right...
I would have to honor inputs too long and too short, etc. Substrings would be overlaid in the order they appear (in case of overlap).
As an example of input, I might have something like:
String:start:end
FO:0:3 (string shorter than field)
BAR:4:5 (String larger than field)
BLEH:5:9 (String overlays previous field)
I'd want to end up with
FO  BBLEH
01234567890
(Edit: To all - StringBuilder (and specifically, the "pre-allocate to a known length, then use .replace()" theme) seems to be what I'm thinking of. Thanks to all who suggested it!)
StringBuilder output = new StringBuilder();
// for each input element
{
while (output.length() < start)
{
output.append(' ');
}
output.replace(start, end, string);
}
You could also establish the final size of output before inserting any string into it. You could make a first pass through the input elements to find the largest end. This will be the final size of output.
char[] spaces = new char[size];
Arrays.fill(spaces, ' ');
output.append(spaces);
Will StringBuilder do?
StringBuilder sb = new StringBuilder();
sb.setLength(20);
sb.replace(0, 3, "FO");
sb.replace(4, 5, "BAR");
sb.replace(5, 9, "BLEH");
System.out.println("[" + sb.toString().replace('\0', ' ') + "]");
// prints "[FO  BBLEH            ]"
If I understand your requirements correctly, you should be able to do this with the standard java.lang.StringBuilder:
public class StringAssembler
{
private final StringBuilder builder = new StringBuilder();
public void addPiece(String input, int start, int end)
{
final String actualInput = input.substring(0, end-start+1);
builder.insert(start, actualInput);
}
public String getFullString()
{
return builder.toString();
}
}
In particular, I don't think that the end parameter is strictly necessary, in that all it can do is change the length of the input string, hence the two steps in my addPiece method.
Note that this is not tested, and probably doesn't do the right thing in edge cases, but it should give you something to start from.
You can use StringUtils.rightPad(str, size) to add the necessary number of spaces. And you can use the following to strip the unneeded characters:
if (str.length() > size) {
str = str.substring(0, size); // keep only the first 'size' characters
}
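Here is a sketch that combines the ideas from the answers above (pad with spaces, truncate or pad each piece to its field, overlay in order). It assumes that end is exclusive, which is one reading of the question's example, and that a piece shorter than its field is padded with spaces. The class name FieldOverlay and its methods are hypothetical.

```java
class FieldOverlay {
    private final StringBuilder out = new StringBuilder();

    // Overlays text onto positions [start, end); end is treated as exclusive (an assumption).
    void put(String text, int start, int end) {
        if (start < 0 || end < start) {
            throw new IllegalArgumentException("bad range " + start + ":" + end);
        }
        while (out.length() < end) {
            out.append(' '); // grow the output with spaces as needed
        }
        int width = end - start;
        for (int i = 0; i < width; i++) {
            // Too-long input is truncated; too-short input is padded with spaces.
            char ch = i < text.length() ? text.charAt(i) : ' ';
            out.setCharAt(start + i, ch);
        }
    }

    String result() {
        return out.toString();
    }

    public static void main(String[] args) {
        FieldOverlay overlay = new FieldOverlay();
        overlay.put("FO", 0, 3);
        overlay.put("BAR", 4, 5);
        overlay.put("BLEH", 5, 9);
        System.out.println("[" + overlay.result() + "]"); // prints [FO  BBLEH]
    }
}
```

Using setCharAt instead of replace or insert keeps the builder's length stable, so later pieces always land at the positions given in the input.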
