Regex or Exception Handling? - java

Which one of the following is a better practice to check if a string is a float?
try{
Double.parseDouble(strVal);
}catch(NumberFormatException e){
//My Logic
}
or
if(!strVal.matches("[-+]?\\d*\\.?\\d+")){
//My Logic
}
In terms of performance, maintenance and readability?
And yeah, I would like to know which one is good coding practice?

Personal opinion - of the code I've seen, I would expect most developers to tend towards the try-catch block. The try-catch is in a sense also more readable and assumes that in most cases the string will contain a valid number. But there are a number of things to consider with your examples which may affect which you choose.
How often do you expect the string to not contain a valid number?
Note that for bulk processing you should create a Pattern object outside of the loop. This will stop the code from having to recompile the pattern every time.
As a general rule you should never use exceptions as logic flow. Your try-catch runs its logic when the value is not a valid number, whereas your regex runs its logic when it is, so it isn't obvious from the snippets what the context of the code is.
If you choose the regex technique, you are still probably going to have to convert at some point, so in effect, it may be a waste of effort.
And finally, are the performance requirements of the application important enough to warrant analysis at this level? Generally speaking I'd recommend keeping things as simple as possible, making it work, and then, if there are performance problems, using some code analysis tools to find the bottlenecks and tune them out.
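A minimal sketch of the bulk-processing point above, reusing a precompiled Pattern instead of calling String.matches() in a loop (the pattern is the one from the question):
import java.util.regex.Pattern;

public class FloatCheck {
    // compiled once, reused for every string checked
    private static final Pattern FLOAT_PATTERN = Pattern.compile("[-+]?\\d*\\.?\\d+");

    static boolean looksLikeFloat(String strVal) {
        return FLOAT_PATTERN.matcher(strVal).matches();
    }

    public static void main(String[] args) {
        System.out.println(looksLikeFloat("3.14")); // true
        System.out.println(looksLikeFloat("abc"));  // false
    }
}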

Performance: Exceptions are slow, and so is exception-based logic, so the second would be faster.
Maintenance / Reliability: The first one is crystal clear and will stay updated with updates to the Java Framework.
That being said, I would personally prefer the first. Performance is something you want to consider as a whole in your architecture, your data structure design, etc. not line by line. Measure for performance and optimize what is actually slow, not what you think might be slow.

The first one is going to perform better than the regex when the string is a valid double. For one, parsing is very fast when the recognizer is hard-coded, as it is in Double.parseDouble(). Also there's nothing to maintain: it accepts whatever Java defines as a valid string form of a double. Not to mention Double.parseDouble() is easier to read.
The regex solution isn't precompiled, so the first thing it has to do is compile and parse the regex, then run it, and then you still have to execute Double.parseDouble() to get the value into a double. And that's going to happen for every number passed in. You might be able to optimize it with Pattern.compile(), but executing the expression is still going to be slower, especially since you still have to run Double.parseDouble() afterwards to get the value into a double.
Yes, exceptions are not super fast, but you only pay that price when the parse fails. If you don't plan on seeing lots of errors then I don't think you'll notice the slowdown from gathering the stack trace on the throw (which is why exceptions perform poorly). If you're only going to encounter a handful of exceptions then performance isn't going to be a problem. The real problem is that you expected a double and didn't get one, which is probably a configuration mistake, so tell the user and quit, or pick a suitable default and continue. That's all you can do in those cases.

If you use parseDouble, you will end up with what Mark said, but in a more readable way, and might profit from performance improvements and bug fixes.
Since exceptions are only costly when they are thrown, there is only a need to look for a different strategy if you:
expect wrong formats to happen often
expect them to fall into a specific pattern which you can catch faster and beforehand
In the end you will call parseDouble either way, and therefore it is considered alright to use it that way.
Note that your pattern rejects 7. as a Double, while Java and C/C++ don't, as well as scientific notation like 4.2e8.
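A small sketch illustrating that mismatch, using the pattern from the question:
String pattern = "[-+]?\\d*\\.?\\d+";
System.out.println("7.".matches(pattern));        // false
System.out.println(Double.parseDouble("7."));     // 7.0
System.out.println("4.2e8".matches(pattern));     // false
System.out.println(Double.parseDouble("4.2e8"));  // 4.2E8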

Maybe you can also try this approach. Note that this is generic for any string containing a valid number.
public static boolean isNumeric(String str)
{
    // requires: import java.text.NumberFormat; import java.text.ParsePosition;
    // e.g. "2.3452342323423424E8", "21414124.12412412412412", "123123"
    NumberFormat formatter = NumberFormat.getInstance();
    ParsePosition pos = new ParsePosition(0);
    formatter.parse(str, pos);
    // the string is numeric only if the parser consumed it completely
    return str.length() == pos.getIndex();
}
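A quick usage sketch of the method above. Two caveats, which are my additions rather than part of the original answer: NumberFormat.getInstance() is locale-sensitive, and the default format typically does not parse exponent notation, so the "2.3452342323423424E8" example may be rejected:
System.out.println(isNumeric("123123")); // true
System.out.println(isNumeric("12abc"));  // false
System.out.println(isNumeric("1,234"));  // may be true, depending on locale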

And yeah, I would like to know which one is good coding practice?
Either can be good coding practice, depending on the context.
If bad numbers are unlikely (i.e. it is an "exceptional" situation), then the exception-based solution is fine. (Indeed, if the probability of bad numbers is small enough, exceptions might even be faster on average. It depends on the relative speed of Double.parseDouble() and a compiled regex for typical input strings. That would need to be measured ...)
If bad numbers are reasonably (or very) likely (i.e. it is NOT an "exceptional" situation), then the regex-based solution is probably better.
If the code path that does the test is infrequently executed, then it really makes no difference which approach you use.

Below is a performance test showing the difference between a regular expression and try/catch for validating that a string is numeric.
The table below shows timings (nanoseconds, from the code below) for a list of 100k entries at three mixes of good data (float values) and bad data (non-numeric strings): 90%/10%, 70%/30%, and 50%/50%.
                     90% - 10%    70% - 30%    50% - 50%
Try Catch            87234580     122297750    143470144
Regular Expression   202700266    192596610    162166308
Performance of try/catch is better (unless the bad data reaches about 50%) even though try/catch may have some impact on performance. The impact comes from try/catch preventing the JVM from doing some optimizations. Joshua Bloch, in "Effective Java," said the following:
• Placing code inside a try-catch block inhibits certain optimizations that modern JVM implementations might otherwise perform.
import java.util.ArrayList;
import java.util.Random;

public class PerformanceStats {
static final String regularExpr = "([0-9]*[.])?[0-9]+";
public static void main(String[] args) {
PerformanceStats ps = new PerformanceStats();
ps.statsFinder();
//System.out.println("123".matches(regularExpr));
}
private void statsFinder() {
int count = 200000;
int ncount = 200000;
ArrayList<String> ar = getList(count, ncount);
System.out.println("count = " + count + " ncount = " + ncount);
long t1 = System.nanoTime();
validateWithCatch(ar);
long t2 = System.nanoTime();
validateWithRegularExpression(ar);
long t3 = System.nanoTime();
System.out.println("time taken with Exception " + (t2 - t1) );
System.out.println("time taken with Regular Expression " + (t3 - t2) );
}
private ArrayList<String> getList(int count, int noiseCount) {
Random rand = new Random();
ArrayList<String> list = new ArrayList<String>();
for (int i = 0; i < count; i++) {
list.add((String) ("" + Math.abs(rand.nextFloat())));
}
// adding noise
for (int i = 0; i < (noiseCount); i++) {
list.add((String) ("sdss" + rand.nextInt() ));
}
return list;
}
private void validateWithRegularExpression(ArrayList<String> list) {
ArrayList<Float> ar = new ArrayList<>();
for (String s : list) {
if (s.matches(regularExpr)) {
ar.add(Float.parseFloat(s));
}
}
System.out.println("the size is in regular expression " + ar.size());
}
private void validateWithCatch(ArrayList<String> list) {
ArrayList<Float> ar = new ArrayList<>();
for (String s : list) {
try {
float e = Float.parseFloat(s);
ar.add(e);
} catch (Exception e) {
}
}
System.out.println("the size is in catch block " + ar.size());
}
}

Related

How to get the last best guess from LeastSquaresOptimizer?

Is there a way to get the last best guess of the LeastSquaresOptimizer?
I am using Apache Commons Math to perform a least squares optimization. To do this, I must provide a maxEvaluations() and maxIterations() value. The issue is, if the optimization does not converge before it hits the maximum number of evaluations or iterations, it throws an org.apache.commons.math4.exception.TooManyIterationsException: illegal state: maximal count (6,000) exceeded: iterations. If this happens, I would like to see what the last best guess of the optimizer was. How do I do this?
LeastSquaresProblem problem = new LeastSquaresBuilder()
.start(new double[]{0,0,0,1,1,1,0,0,0})
.model(costFunc)
.target(gravity)
.lazyEvaluation(false)
.maxEvaluations(150000)
.maxIterations(6000)
.build();
LeastSquaresOptimizer.Optimum optimum;
try {
optimum = new LevenbergMarquardtOptimizer()
.withCostRelativeTolerance(1.0e-10)
.optimize(problem);
} catch (Exception e) {
throw new Exception(e);
}
There's probably a better way, but what about storing the best guess manually? The optimizer calls problem::evaluate(RealVector point) repeatedly. When you replace it by your own function, you get the input and output.
This replacement should be rather trivial using
LeastSquaresProblem wrappedProblem = new LeastSquaresAdapter(problem) {
    @Override
    public LeastSquaresProblem.Evaluation evaluate(RealVector point) {
        LeastSquaresProblem.Evaluation result = super.evaluate(point);
        // ... store the point and the result, if it's an improvement
        return result;
    }
};
I'm afraid the data is mutable, so you'll need to clone it so it won't get overwritten.
I may be wrong as I've got no experience with this library.
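A rough sketch of the bookkeeping inside that override, assuming Commons Math's RealVector.copy() and Evaluation.getRMS(); the field names are hypothetical:
// fields on the enclosing class
private RealVector bestPoint;
private double bestRms = Double.POSITIVE_INFINITY;

// inside evaluate(RealVector point), after calling super.evaluate(point):
double rms = result.getRMS();
if (rms < bestRms) {
    bestRms = rms;
    bestPoint = point.copy(); // copy, because the optimizer may reuse/mutate the vector
}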

Is there a performance impact when using Guava.Preconditions with concatenated strings?

In our code, we often check arguments with Preconditions:
Preconditions.checkArgument(expression, "1" + var + "3");
But sometimes, this code is called very often. Could this have a notable negative impact on performance? Should we switch to
Preconditions.checkArgument(expression, "%s%s%s", 1, var, 3);
?
(I expect the condition to be true most of the time; false means a bug.)
If you expect the check to not throw any exception most of the time, there is no reason to use the string concatenation. You'll lose more time concatenating (using .concat or a StringBuilder) before calling the method than doing it after you're sure you're throwing an exception.
Conversely, if you're throwing an exception, you're already in the slow branch.
It's also worth mentioning that Guava uses a custom and faster formatter which accepts only %s. So the cost is actually closer to the {} placeholder handling of standard loggers (slf4j or log4j 2). But as written above, this only matters when you're already in the slow branch.
In any case, I would strongly recommend against either of your suggestions; I'd use this one instead:
Preconditions.checkArgument(expression, "1%s3", var);
You should only put variables in %s, not constants, to gain marginal speed.
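A minimal sketch of the lazy-formatting point, assuming Guava on the classpath; the method and values are hypothetical:
import com.google.common.base.Preconditions;

public class PreconditionDemo {
    static void setTimeout(long millis) {
        // the message template is only formatted if the check fails,
        // so the common (valid) path pays no string-building cost
        Preconditions.checkArgument(millis > 0, "timeout must be positive, was %s", millis);
    }

    public static void main(String[] args) {
        setTimeout(500); // fast path: nothing is formatted
        setTimeout(-1);  // throws IllegalArgumentException: timeout must be positive, was -1
    }
}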
In the case of String literal concatenation, the compiler should do this at compile time, so there is no runtime performance hit. At least the standard JDK does this; it is not required by the specification (so some compilers may not optimize it).
In the case of variables, constant folding won't work, so there will be work at runtime. However, newer Java compilers will replace string concatenation with StringBuilder, which should be more efficient, as it is not immutable, unlike String.
This should be faster than using a formatter, if the message is actually needed. However, if you don't expect the check to fail very often, then this can be slower, as the concatenation always happens, even when the argument is true and the method does nothing.
Anyway, to wrap it up: I do not think it is worth rewriting the existing calls. However, in new code, you can use the formatter without doubts.
I wrote a simple test. Using the formatter is much faster, as suggested here. The difference in performance grows with the number of calls (the formatter's cost per call stays roughly constant). I guess garbage collection time grows with the number of calls when plain string concatenation is used.
Here is one sample result:
started with 10000000 calls and 100 runs
formatter: 0.94 (mean per run)
string: 181.11 (mean per run)
Formatter is 192.67021 times faster. (this difference grows with number of calls)
Here is the code (Java 8, Guava 18):
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import com.google.common.base.Preconditions;
import com.google.common.base.Stopwatch;
public class App {
public static void main(String[] args) {
int count = 10000000;
int runs = 100;
System.out.println("started with " + count + " calls and " + runs + "runs");
Stopwatch stopwatch = Stopwatch.createStarted();
run(count, runs, i->fast(i));
stopwatch.stop();
float fastTime = (float)stopwatch.elapsed(TimeUnit.MILLISECONDS)/ runs;
System.out.println("fast: " + fastTime + " (mean per run)");
//
stopwatch.reset();
System.out.println("reseted: "+stopwatch.elapsed(TimeUnit.MILLISECONDS));
stopwatch.start();
run(count, runs, i->slow(i));
stopwatch.stop();
float slowTime = (float)stopwatch.elapsed(TimeUnit.MILLISECONDS)/ runs;
System.out.println("slow: " + slowTime + " (mean per run)");
float times = slowTime/fastTime;
System.out.println("Formatter is " + times + " times faster." );
}
private static void run(int count, int runs, Consumer<Integer> function) {
for(int c=0;c<count;c++){
for(int r=0;r<runs;r++){
function.accept(r);
}
}
}
private static void slow(int i) {
Preconditions.checkArgument(true, "var was " + i);
}
private static void fast(int i) {
Preconditions.checkArgument(true, "var was %s", i);
}
}

Performance difference between assignment and conditional test

This question is specifically geared towards the Java language, but I would not mind feedback about this being a general concept if so. I would like to know which operation might be faster, or if there is no difference between assigning a variable a value and performing tests for values. For this issue we could have a large series of Boolean values that will have many requests for changes. I would like to know if testing for the need to change a value would be considered a waste when weighed against the speed of simply changing the value during every request.
public static void main(String[] args){
Boolean array[] = new Boolean[veryLargeValue];
for(int i = 0; i < array.length; i++) {
array[i] = randomTrueFalseAssignment;
}
for(int i = 400; i < array.length - 400; i++) {
testAndChange(array, i);
}
for(int i = 400; i < array.length - 400; i++) {
justChange(array, i);
}
}
This could be the testAndChange method
public static void testAndChange(Boolean[] pArray, int ind) {
    if (pArray[ind])
        pArray[ind] = false;
}
This could be the justChange method
public static void justChange(Boolean[] pArray, int ind) {
pArray[ind] = false;
}
If we were to end up with the very rare case that every value within the range supplied to the methods were false, would there be a point where one method would eventually become slower than the other? Is there a best practice for issues similar to this?
Edit: I wanted to add this to help clarify the question a bit more. I realize that the data type can be factored into the answer, as larger or more efficient data types can be utilized. I am more focused on the task itself. Is the task of a test "if(aConditionalTest)" slower, faster, or indeterminable without additional information (such as data type) compared to the task of an assignment "x = aValue"?
As @TrippKinetics points out, there is a semantic difference between the two methods. Because you use Boolean instead of boolean, it is possible that one of the values is a null reference. In that case the first method (with the if-statement) will throw an exception, while the second simply assigns values to all the elements in the array.
Assuming you use boolean[] instead of Boolean[]: optimization is an undecidable problem. There are very rare cases where adding an if-statement could result in better performance. For instance, most processors use caches, and the if-statement can mean the executed code happens to fit on exactly two cache pages where, without the if, it would span more, resulting in cache misses. Perhaps you think you will save an assignment instruction, but at the cost of a fetch instruction and a conditional instruction (which can break the CPU pipeline). Assigning has more or less the same cost as fetching a value.
In general however, one can assume that adding an if statement is useless and will nearly always result in slower code. So you can quite safely state that the if statement will slow down your code always.
More specifically on your question, there are faster ways to set a range to false. For instance using bitvectors like:
long[] data = new long[(veryLargeValue+0x3f)>>0x06];//a long has 64 bits
//assign random values
int low = 400>>0x06;
int high = (veryLargeValue-400)>>0x06;
data[low] &= 0xffffffffffffffff<<(0x3f-(400&0x3f));
for(int i = low+0x01; i < high; i++) {
data[i] = 0x00;
}
data[high] &= 0xffffffffffffffff>>(veryLargeValue-400)&0x3f));
The advantage is that a processor can perform operations on 32- or 64-bits at once. Since a boolean is one bit, by storing bits into a long or int, operations are done in parallel.
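For completeness, a sketch of the same range-clearing using the standard java.util.BitSet, which wraps exactly this kind of long[] bit manipulation (the size and bounds are taken from the question's example):
import java.util.BitSet;

BitSet flags = new BitSet(veryLargeValue);
// ... assign random values ...
flags.clear(400, veryLargeValue - 400); // clears the range [400, veryLargeValue - 400) in one call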

Remove the delimiter , at the end

String prefix = "";
for (String serverId : serverIds) {
sb.append(prefix);
prefix = ",";
sb.append(serverId);
}
The following code runs faster than the above code. The "," prefix object does unnecessary object creation on every iteration. The above code takes 86324 nanoseconds, while mine takes only 68165 nanoseconds.
List<String> l = Arrays.asList("SURESH1","SURESH2","SURESH4","SURESH5");
StringBuffer l1 = new StringBuffer();
int sz = l.size();
int i=0; long t =
System.nanoTime();
for (String s : l)
{
l1.append(s);
if ( i != sz-1)
l1.append(","); i++;
}
}
long t2 = System.nanoTime();
System.out.println ((t2-t)); System.out.println(l1);
// The time taken for the above code is 68165 nano seconds
SURESH1,SURESH2,SURESH4,SURESH5
Kindly let me know which one is better in your view.
A few points:
My code doesn't require you to know the number of elements up-front. In other words, it can work over any Iterable<String>
Why are you using StringBuffer at all rather than StringBuilder?
The "empty prefix object" is only created once... how sure are you that there aren't any references to an empty string literal anywhere in your code?
Which code do you find simpler to read? That's likely to be more important than timing in most cases. (As originally posted, your code didn't even have balanced braces, for example...)
Why not use a library method in the first place (e.g. Guava's Joiner class)?
Never use timings this small in a benchmark. How accurate do you expect your system clock to be? You should repeat the same operation many, many times until it's taken a sensible amount of time.
EDIT: Now one alternative which addresses the first point would be this change:
boolean first = true;
StringBuilder builder = new StringBuilder();
for (String value : values) {
if (first) {
first = false;
} else {
builder.append(",");
}
builder.append(value);
}
Or if you really like using a counter:
int i = 0;
StringBuilder builder = new StringBuilder();
for (String value : values) {
if (i != 0) {
builder.append(",");
}
builder.append(value);
i++;
}
I also have serious doubts about the way you have coded and run your benchmarks. For a start, your timings suggest that your code didn't get JIT compiled. There are many mistakes that people make with Java benchmarks that can invalidate the results. Show us the complete code.
The other point is that in most cases, this kind of micro-optimization is irrelevant to the performance of real programs. Either the program already runs fast enough, or you are wasting your time optimizing the wrong part of the program.

Should I use Java's String.format() if performance is important?

We have to build Strings all the time for log output and so on. Over the JDK versions we have learned when to use StringBuffer (many appends, thread safe) and StringBuilder (many appends, non-thread-safe).
What's the advice on using String.format()? Is it efficient, or are we forced to stick with concatenation for one-liners where performance is important?
e.g. ugly old style,
String s = "What do you get if you multiply " + varSix + " by " + varNine + "?";
vs. tidy new style (String.format, which is possibly slower),
String s = String.format("What do you get if you multiply %d by %d?", varSix, varNine);
Note: my specific use case is the hundreds of 'one-liner' log strings throughout my code. They don't involve a loop, so StringBuilder is too heavyweight. I'm interested in String.format() specifically.
I took hhafez's code and added a memory test:
private static void test() {
Runtime runtime = Runtime.getRuntime();
long memory;
...
memory = runtime.freeMemory();
// for loop code
memory = memory-runtime.freeMemory();
I ran this separately for each approach (the '+' operator, String.format, and StringBuilder, calling toString()), so the memory used is not affected by the other approaches.
I added more concatenations, making the string "Blah" + i + "Blah" + i + "Blah" + i + "Blah".
The results are as follows (average of 5 runs each):
Approach          Time (ms)    Memory allocated (bytes)
+ operator        747          320,504
String.format     16484        373,312
StringBuilder     769          57,344
We can see that String + and StringBuilder are practically identical time-wise, but StringBuilder is much more efficient in memory use.
This is very important when we have many log calls (or any other statements involving strings) in a time interval short enough so the Garbage Collector won't get to clean the many string instances resulting of the + operator.
And a note, BTW, don't forget to check the logging level before constructing the message.
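A minimal sketch of that log-level check, assuming an SLF4J-style logger; the class and logger setup are illustrative, not from the answer:
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class GuardedLogging {
    private static final Logger log = LoggerFactory.getLogger(GuardedLogging.class);

    void work(int varSix, int varNine) {
        // the message is only built when debug logging is actually enabled
        if (log.isDebugEnabled()) {
            log.debug("What do you get if you multiply " + varSix + " by " + varNine + "?");
        }
    }
}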
Conclusions:
I'll keep on using StringBuilder.
I have too much time or too little life.
I wrote a small class to test which of the two has the better performance, and + comes ahead of format by a factor of 5 to 6.
Try it yourself:
import java.io.*;
import java.util.Date;
public class StringTest{
public static void main( String[] args ){
int i = 0;
long prev_time = System.currentTimeMillis();
long time;
for( i = 0; i< 100000; i++){
String s = "Blah" + i + "Blah";
}
time = System.currentTimeMillis() - prev_time;
System.out.println("Time after for loop " + time);
prev_time = System.currentTimeMillis();
for( i = 0; i<100000; i++){
String s = String.format("Blah %d Blah", i);
}
time = System.currentTimeMillis() - prev_time;
System.out.println("Time after for loop " + time);
}
}
Running the above for different N shows that both behave linearly, but String.format is 5-30 times slower.
The reason is that in the current implementation String.format first parses the input with regular expressions and then fills in the parameters. Concatenation with plus, on the other hand, gets optimized by javac (not by the JIT) and uses StringBuilder.append directly.
All the benchmarks presented here have some flaws, thus results are not reliable.
I was surprised that nobody used JMH for benchmarking, so I did.
Results:
Benchmark Mode Cnt Score Error Units
MyBenchmark.testOld thrpt 20 9645.834 ± 238.165 ops/s // using +
MyBenchmark.testNew thrpt 20 429.898 ± 10.551 ops/s // using String.format
Units are operations per second, the more the better. Benchmark source code. OpenJDK IcedTea 2.5.4 Java Virtual Machine was used.
So, old style (using +) is much faster.
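The linked benchmark source is not reproduced here, but a minimal JMH benchmark along these lines might look like the sketch below; the method names follow the results table, everything else is an assumption:
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;

@State(Scope.Benchmark)
public class MyBenchmark {
    // fields rather than constants, so javac cannot fold the concatenation at compile time
    int varSix = 6;
    int varNine = 9;

    @Benchmark
    public String testOld() {
        return "What do you get if you multiply " + varSix + " by " + varNine + "?";
    }

    @Benchmark
    public String testNew() {
        return String.format("What do you get if you multiply %d by %d?", varSix, varNine);
    }
}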
Your old ugly style is automatically compiled by javac 1.6 as:
StringBuilder sb = new StringBuilder("What do you get if you multiply ");
sb.append(varSix);
sb.append(" by ");
sb.append(varNine);
sb.append("?");
String s = sb.toString();
So there is absolutely no difference between this and using a StringBuilder.
String.format is a lot more heavyweight since it creates a new Formatter, parses your input format string, creates a StringBuilder, appends everything to it and calls toString().
Java's String.format works like so:
it parses the format string, exploding into a list of format chunks
it iterates the format chunks, rendering into a StringBuilder, which is basically an array that resizes itself as necessary by copying into a new array. This is necessary because we don't yet know how large to allocate the final String
StringBuilder.toString() copies its internal buffer into a new String
if the final destination for this data is a stream (e.g. rendering a webpage or writing to a file), you can assemble the format chunks directly into your stream:
new PrintStream(outputStream, autoFlush, encoding).format("hello %s", "world");
I speculate that the optimizer will optimize away the format string processing. If so, you're left with equivalent amortized performance to manually unrolling your String.format into a StringBuilder.
To expand/correct on the first answer above, it's not translation that String.format would help with, actually.
What String.format will help with is when you're printing a date/time (or a numeric format, etc), where there are localization(l10n) differences (ie, some countries will print 04Feb2009 and others will print Feb042009).
With translation, you're just talking about moving any externalizable strings (like error messages and what-not) into a property bundle so that you can use the right bundle for the right language, using ResourceBundle and MessageFormat.
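A short sketch of that pattern, assuming a hypothetical bundle named "messages" with a key "multiply.question" (both names are illustrative):
import java.text.MessageFormat;
import java.util.Locale;
import java.util.ResourceBundle;

// messages_en.properties: multiply.question=What do you get if you multiply {0} by {1}?
ResourceBundle bundle = ResourceBundle.getBundle("messages", Locale.getDefault());
String s = MessageFormat.format(bundle.getString("multiply.question"), varSix, varNine);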
Looking at all the above, I'd say that performance-wise, String.format vs. plain concatenation comes down to what you prefer. If you prefer looking at calls to .format over concatenation, then by all means, go with that.
After all, code is read a lot more than it's written.
In your example, performance probably isn't too different, but there are other issues to consider: namely memory fragmentation. Even a concatenation operation creates a new string, even if it's temporary (it takes time to GC it and it's more work). String.format() is just more readable and it involves less fragmentation.
Also, if you're using a particular format a lot, don't forget you can use the Formatter() class directly (all String.format() does is instantiate a one use Formatter instance).
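A small sketch of reusing a Formatter directly instead of letting String.format() create a new one per call; the loop and format string are illustrative:
import java.util.Formatter;

StringBuilder out = new StringBuilder();
Formatter formatter = new Formatter(out);
for (int i = 0; i < 3; i++) {
    formatter.format("line %d%n", i); // appends into the same StringBuilder
}
formatter.close();
System.out.println(out);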
Also, something else you should be aware of: be careful of using substring(). For example:
String getSmallString() {
String largeString = // load from file; say 2M in size
return largeString.substring(100, 300);
}
That large string is still in memory because, in older JVMs, substring() shares the parent string's backing array instead of copying it. A better version is:
return new String(largeString.substring(100, 300));
or
return String.format("%s", largeString.substring(100, 300));
The second form is probably more useful if you're doing other stuff at the same time.
Generally you should use String.format because it's relatively fast and it supports globalization (assuming you're actually trying to write something that is read by the user). It also makes it easier to globalize if you're trying to translate one string versus three or more per statement (especially for languages that have drastically different grammatical structures).
Now if you never plan on translating anything, then either rely on Java's built-in conversion of + operators into StringBuilder, or use Java's StringBuilder explicitly.
Another perspective from Logging point of view Only.
I see a lot of discussion related to logging on this thread so thought of adding my experience in answer. May be someone will find it useful.
I guess the motivation of logging using formatter comes from avoiding the string concatenation. Basically, you do not want to have an overhead of string concat if you are not going to log it.
You do not really need to concat/format unless you are actually going to log. Let's say I define a method like this:
public void logDebug(Throwable t, String... args) {
    if (debugOn) {
        // concatenate all args into the final debug message
        // log the final debug message along with t
    }
}
In this approach the concat/formatter is not really called at all if it's a debug message and debugOn is false.
Though it would still be better to use StringBuilder instead of a formatter here; the main motivation is to avoid any of that work.
At the same time I do not like adding "if" block for each logging statement since
It affects readability
Reduces coverage in my unit tests - that's confusing when you want to make sure every line is tested.
Therefore I prefer to create a logging utility class with methods like above and use it everywhere without worrying about performance hit and any other issues related to it.
I just modified hhafez's test to include StringBuilder. StringBuilder is 33 times faster than String.format using jdk 1.6.0_10 client on XP. Using the -server switch lowers the factor to 20.
public class StringTest {
public static void main( String[] args ) {
test();
test();
}
private static void test() {
int i = 0;
long prev_time = System.currentTimeMillis();
long time;
for ( i = 0; i < 1000000; i++ ) {
String s = "Blah" + i + "Blah";
}
time = System.currentTimeMillis() - prev_time;
System.out.println("Time after for loop " + time);
prev_time = System.currentTimeMillis();
for ( i = 0; i < 1000000; i++ ) {
String s = String.format("Blah %d Blah", i);
}
time = System.currentTimeMillis() - prev_time;
System.out.println("Time after for loop " + time);
prev_time = System.currentTimeMillis();
for ( i = 0; i < 1000000; i++ ) {
new StringBuilder("Blah").append(i).append("Blah");
}
time = System.currentTimeMillis() - prev_time;
System.out.println("Time after for loop " + time);
}
}
While this might sound drastic, I consider it to be relevant only in rare cases, because the absolute numbers are pretty low: 4 s for 1 million simple String.format calls is sort of ok - as long as I use them for logging or the like.
Update: As pointed out by sjbotha in the comments, the StringBuilder test is invalid, since it is missing a final .toString().
The correct speed-up factor from String.format(.) to StringBuilder is 23 on my machine (16 with the -server switch).
Here is a modified version of hhafez's entry. It includes a StringBuilder option.
public class BLA
{
public static final String BLAH = "Blah ";
public static final String BLAH2 = " Blah";
public static final String BLAH3 = "Blah %d Blah";
public static void main(String[] args) {
int i = 0;
long prev_time = System.currentTimeMillis();
long time;
int numLoops = 1000000;
for( i = 0; i< numLoops; i++){
String s = BLAH + i + BLAH2;
}
time = System.currentTimeMillis() - prev_time;
System.out.println("Time after for loop " + time);
prev_time = System.currentTimeMillis();
for( i = 0; i<numLoops; i++){
String s = String.format(BLAH3, i);
}
time = System.currentTimeMillis() - prev_time;
System.out.println("Time after for loop " + time);
prev_time = System.currentTimeMillis();
for( i = 0; i<numLoops; i++){
StringBuilder sb = new StringBuilder();
sb.append(BLAH);
sb.append(i);
sb.append(BLAH2);
String s = sb.toString();
}
time = System.currentTimeMillis() - prev_time;
System.out.println("Time after for loop " + time);
}
}
Time after for loop 391
Time after for loop 4163
Time after for loop 227
The answer to this depends very much on how your specific Java compiler optimizes the bytecode it generates. Strings are immutable and, theoretically, each "+" operation can create a new one. But, your compiler almost certainly optimizes away interim steps in building long strings. It's entirely possible that both lines of code above generate the exact same bytecode.
The only real way to know is to test the code iteratively in your current environment. Write a QD app that concatenates strings both ways iteratively and see how they time out against each other.
Consider using "hello".concat( "world!" ) for small number of strings in concatenation. It could be even better for performance than other approaches.
If you have more than 3 strings, than consider using StringBuilder, or just String, depending on compiler that you use.
