Best way to convert integers to Strings in Java [duplicate] - java

So I have seen this question, which gives several ways to convert integers to Strings, but I am wondering if there is any difference between them.
If I want to just convert an integer i to a string, then is there a difference between these three ways (and are some faster than others)?
i+""
Integer.toString(i)
String.valueOf(i)
I would be inclined to use the second or third since the first one just seems weird to me. On the other hand, if I wanted to convert an integer i to a string and then concatenate it to another string, s, I could do:
i+s
Integer.toString(i)+s
String.valueOf(i)+s
Here I would be inclined to use the first one since I am concatenating anyway. But my question is: are there any standard practices that should be used here, and what exactly are the differences (if any) among these three methods?

Option 1, ""+i, is compiled by javac as option 1b: new StringBuilder("").append(i).toString().
Option 2, String.valueOf(i), internally calls Integer.toString(i) and therefore has the overhead of one extra method call.
Option 3, Integer.toString(i), seems to be the fastest in my benchmark below.
In my tests (approximate average over multiple runs):
Option 1: ~64000 ms
Option 1b: ~64000 ms (same as option 1, due to equivalent compilation)
Option 2: ~86000 ms (due to additional method call)
Option 3: ~40000 ms
The simplistic benchmark code I used:
public static void main(String[] args) {
int i = 0;
String result;
long time;
time = System.currentTimeMillis();
while (i < Integer.MAX_VALUE) {
result = ""+i;
i++;
}
System.out.println("Option 1: " + (System.currentTimeMillis() - time));
i = 0;
time = System.currentTimeMillis();
while (i < Integer.MAX_VALUE) {
result = new StringBuilder("").append(i).toString();
i++;
}
System.out.println("Option 1b: " + (System.currentTimeMillis() - time));
i = 0;
time = System.currentTimeMillis();
while (i < Integer.MAX_VALUE) {
result = String.valueOf(i);
i++;
}
System.out.println("Option 2: " + (System.currentTimeMillis() - time));
i = 0;
time = System.currentTimeMillis();
while (i < Integer.MAX_VALUE) {
result = Integer.toString(i); // fixed: the post passed the constant 1 here, which may have skewed this timing
i++;
}
System.out.println("Option 3: " + (System.currentTimeMillis() - time));
}
In conclusion, at least on my JVM (JDK 1.7.0_60, 64-bit), option 3, Integer.toString(i), is the fastest, and I'd recommend using it.
The conclusion in one of the other posts that ""+i is the fastest is likely due to a flawed benchmark that allowed the expression to be compiled into a constant.

A simple test revealed HUGE differences between them.
Allowing for start up and cool down between runs, then for every test, running to Integer.MAX_VALUE for every option, I got these results:
""+1 took 675 millis
String.valueOf(1) took 52244 millis
Integer.toString(1) took 53205 millis
Result: Whenever possible, use i+"" or ""+1.
Sense of social duty kicking in here:
My tests were (as I indicated) ""+1, String.valueOf(1), and Integer.toString(1). When I re-run my tests I find the same results. HOWEVER, when I use variables, as indicated by ASantos and VSchäfer I get times similar between the solutions.
Whew. I feel better now.

Good answers. I've tried it (just for fun) and got similar times.
""+i is compiled as new StringBuilder("").append(i).toString(), whereas ""+1 is inlined (folded into a constant at compile time).
I suspect some of the benchmarks presented here were inlined.
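To see the inlining for yourself, here is a minimal sketch. It relies only on the language rule that compile-time constant strings are interned while run-time concatenation creates a new String; the class and method names are made up for illustration.

```java
public class ConstantFoldingDemo {
    // "" + 1 is a compile-time constant expression, so javac emits the literal "1".
    static String constantCase() {
        return "" + 1;
    }

    // i is an ordinary parameter, so this concatenation has to run at run time.
    static String variableCase(int i) {
        return "" + i;
    }

    public static void main(String[] args) {
        String literal = "1";
        // Constant strings are interned, so both references point to the same object.
        System.out.println(constantCase() == literal);      // true
        // A run-time concatenation creates a fresh String with equal contents.
        System.out.println(variableCase(1) == literal);     // false
        System.out.println(variableCase(1).equals(literal)); // true
    }
}
```

A loop over the constant case therefore measures almost nothing, which would explain the sub-second timings.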
Based on my tests I found these times:
Option 1 (K) ""+8 : 655 milliseconds
Option 1 (var) ""+i: 83462 milliseconds
Option 2 String.valueOf(i): 90685 milliseconds
Option 3 Integer.toString(i): 88764 milliseconds
Option 1 (K) is what I suspect some tests here were using.
Option 3 is slightly faster than option 2. Maybe due to some optimization in the JVM?
I am using a MacBook Pro, the JVM is 1.7.0_21
This is the source code I used:
public class Main{
public static void main(String[] args) throws InterruptedException
{
// warm up the JVM
for (int i = 0; i < 1000; i++)
{
String aString = ""+i;
}
long now = System.currentTimeMillis();
for (int i = 0; i < Integer.MAX_VALUE; i++)
{
String aString = ""+8;
}
System.out.println("Option 1 (K) \"\"+8 "+(System.currentTimeMillis()-now));
now = System.currentTimeMillis();
for (int i = 0; i < Integer.MAX_VALUE; i++)
{
String aString = ""+i;
}
System.out.println("Option 1 (var) \"\"+i "+(System.currentTimeMillis()-now));
Thread.sleep(1000);
now = System.currentTimeMillis();
for (int i = 0; i < Integer.MAX_VALUE; i++)
{
String aString = String.valueOf(i);
}
System.out.println("Option 2 String.valueOf(i) "+(System.currentTimeMillis()-now));
Thread.sleep(1000);
now = System.currentTimeMillis();
for (int i = 0; i < Integer.MAX_VALUE; i++)
{
String aString = Integer.toString(i);
}
System.out.println("Option 3 Integer.toString(i) "+(System.currentTimeMillis()-now));
}
}

Differences:
int to String
//overhead, as you are appending an empty string to an int
""+i
//More generic (can be the same for float, double, long & int)
String.valueOf(i)
//Type Safe you know for sure that what you are converting is an Integer.
Integer.toString(i)
No, there is no standard way; it all depends on what you are trying to accomplish. If you want your code to be more robust, you might use String.valueOf(); if you want to be certain that the value is in fact an Integer, use Integer.toString() (the first option is not recommended).
There is no significant performance difference between these three.
The concatenation case is analogous to the comments above.

i+"" tends to be computationally faster than the other two, but as far as industry standards go, it tends to be frowned upon. I personally would use String.valueOf(i), but I believe Integer.toString(i) is also acceptable. i + "S" in the case of concatenation is also fine.

String.valueOf(i);
actually calls
Integer.toString(i);
One more thing: if you invoke toString() on a null object, you will get a NullPointerException.
Moreover, the static valueOf() methods of the String class accept different primitive parameters and offer more flexibility.

The source code of String.valueOf(int) is:
public static String valueOf(int i) {
return Integer.toString(i);
}
Since the String class is final, this call will probably be inlined by the JIT.
In the case of i + "" it will create a StringBuilder and call its append(int) method, which ultimately results in a call to Integer.getChars, the same method that Integer.toString uses to get the string representation of the integer. It also calls the append(String) method to append the empty string, and then toString to get the resulting String from the StringBuilder.
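Since all three forms end up in the same digit-rendering code, they produce identical strings. A small sketch to confirm that (class and method names are mine, chosen for illustration):

```java
public class IntToStringDemo {
    static String viaConcat(int i) {
        return i + "";              // run-time concatenation with an empty string
    }

    static String viaValueOf(int i) {
        return String.valueOf(i);   // delegates to Integer.toString(i)
    }

    static String viaToString(int i) {
        return Integer.toString(i); // the direct call
    }

    public static void main(String[] args) {
        int[] samples = {0, -1, 42, Integer.MAX_VALUE, Integer.MIN_VALUE};
        for (int i : samples) {
            // Print the three results side by side for each sample value.
            System.out.println(viaConcat(i) + " " + viaValueOf(i) + " " + viaToString(i));
        }
    }
}
```

So the choice between them is about readability and, at most, a small constant overhead, not about the result.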

Related

Which solution is better, faster and readable? [closed]

The problem: "An isogram is a word that has no repeating letters, consecutive or non-consecutive. Implement a function that determines whether a string that contains only letters is an isogram. Assume the empty string is an isogram. Ignore letter case."
I wrote two solutions.
First solution:
Scanner scanner = new Scanner(System.in);
String string = scanner.next();
long start = System.currentTimeMillis();
char words[] = string.toCharArray();
boolean isIsogram=true;
for (int i=(words.length-1); i>=0; i--){
for(int j=0; j<i; j++){ // j<(i-1) would skip the adjacent pair, so "aa" would pass
if(words[i]==words[j]){
isIsogram=false;
}
}
}
long finish = System.currentTimeMillis();
System.out.println(isIsogram + " time:"+ (finish-start) );
Second solution:
Scanner scanner = new Scanner(System.in);
String string = scanner.next();
long start = System.currentTimeMillis();
boolean isIsogram = (string.length() == string.toLowerCase().chars().distinct().count());
long finish = System.currentTimeMillis();
System.out.println(isIsogram + " time:"+ (finish-start) );
I have tested both solutions and here are the results:
input: "asd"
1) true time 0
2) true time 113
I would like to know your opinion: which solution is better?
My teacher told me the second solution is better, but the first solution takes less time, and I am not sure which is better.
The right answer is actually to profile your specific problem and see what works for your requirements.
Readability is important. What approach is most readable is very subjective. Stream-oriented operations usually attack the problem from a more declarative approach rather than an imperative approach. Declarative code is usually much easier to read, but imperative code is often faster.
But how fast do you need to be? Even your (very flawed) benchmark shows only a difference of 100 milliseconds. That's faster than the threshold of human perception. If your code isn't too slow, then don't worry about making it faster. Worry about making it clear, maintainable, debuggable, and correct first.
In any case, since this is a fun problem, I poked at it for a minute. You have 2^16 possible char values in a String, so if you use a BitSet you can have a yes/no bit for each of them, and still fit the whole thing in 8K of memory.
Your question said to do case folding. That sounds like a simplification, but it's really not, unless your data is ASCII (in which case, you only need a 256-bit BitSet, or possibly only a 26-bit one!). If you can use the full range of Unicode characters, even the problem of reliably converting from upper case to lower case becomes almost impossible to do correctly. (Case conversion is ultimately locale-specific.)
So I'm going to assume you want to handle all possible char values, which won't handle UTF-16 surrogates (like you need for emoji) correctly, but should handle everything that's considered a "letter" in alphabetic languages.
Here's what I came up with:
static boolean isIsogram(String text) {
java.util.BitSet bits = new java.util.BitSet(1 << 16);
for (int i = 0; i < text.length(); i++) {
int ch = (int) text.charAt(i);
if (bits.get(ch)) {
return false;
}
bits.set(ch);
}
return true;
}
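If you do want the "ignore letter case" part, here is a rough sketch under a loud assumption: it folds case one code point at a time with Character.toLowerCase, which is locale-insensitive and ignores multi-character case mappings, so it is only an approximation of real case folding. The class and method names are made up.

```java
import java.util.HashSet;
import java.util.Set;

public class IsogramSketch {
    // Returns true when no letter repeats, comparing code points after simple lower-casing.
    static boolean isIsogramIgnoringCase(String text) {
        Set<Integer> seen = new HashSet<>();
        for (int offset = 0; offset < text.length(); ) {
            int codePoint = text.codePointAt(offset);
            // Assumption: per-code-point lower-casing is good enough for the input.
            int folded = Character.toLowerCase(codePoint);
            if (!seen.add(folded)) {
                return false; // add() returns false when the value was already present
            }
            offset += Character.charCount(codePoint); // step over surrogate pairs as one unit
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isIsogramIgnoringCase("Dermatoglyphics")); // true
        System.out.println(isIsogramIgnoringCase("moOse"));           // false
    }
}
```

It trades the BitSet for a HashSet so that code points above 0xFFFF work too; for short words the difference in speed should not matter.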
A few things on readability:
First codeblock:
I would have the counters counting in the same direction; it will still compare each pair of characters this way. It's not a terribly important change, but it saves the reader a step, so that they don't have to do any mental math to determine whether the code produces the intended result, since the result is apparent (it's easy to see that the code's time complexity is O(n^2)).
boolean isIsogram = true;
Scanner scanner = new Scanner(System.in);
String string = scanner.next();
long start = System.currentTimeMillis();
char words[] = string.toCharArray();
for (int i = 0; i < words.length; i++){
for(int j = i + 1; j < words.length; j++){ // start after i so a character is never compared with itself
if(words[i] == words[j]){
isIsogram = false;
break; //I agree with the first answer
}
}
if (!isIsogram)
break;
}
long finish = System.currentTimeMillis();
System.out.println(isIsogram + " time:" + (finish-start) );
The second codeblock is quite readable, although I may be primed towards understanding the problem and so it might actually be more readable because of that. But the calls to compare distinct characters make complete sense in terms of the goal.

How to generate 1000 unique email-ids using java

My requirement is to generate 1000 unique email-ids in Java. I have already generated random text, and I'm using a for loop to limit the number of email-ids generated. The problem is that when I run it, 10 email-ids are generated but they are all the same.
Below is the code and output:
public static void main() {
first fr = new first();
String n = fr.genText()+"#mail.com";
for (int i = 0; i<=9; i++) {
System.out.println(n);
}
}
public String genText() {
String randomText = "abcdefghijklmnopqrstuvwxyz";
int length = 4;
String temp = RandomStringUtils.random(length, randomText);
return temp;
}
and output is:
myqo#mail.com
myqo#mail.com
...
myqo#mail.com
When I execute the same above program I get another set of mail-ids. Example: instead of 'myqo' it will be 'bfta'. But my requirement is to generate different unique ids.
For Example:
myqo#mail.com
bfta#mail.com
kjuy#mail.com
Put your String initialization in the for statement:
for (int i = 0; i<=9; i++) {
String n = fr.genText()+"#mail.com";
System.out.println(n);
}
I would like to rewrite your method a little bit:
public String generateEmail(String domain, int length) {
return RandomStringUtils.random(length, "abcdefghijklmnopqrstuvwxyz") + "#" + domain;
}
And it would be possible to call like:
generateEmail("gmail.com", 4);
As I understand it, you want to generate 1000 unique emails; you can do this conveniently with the Stream API:
Stream.generate(() -> generateEmail("gmail.com", 4))
.limit(1000)
.collect(Collectors.toSet())
But the problem still exists. I purposely collected the Stream<String> into a Set<String> (which removes duplicates) to check its size(). As you can see, the size is not always equal to 1000:
999
1000
997
which means your algorithm returns duplicate values even for such a small range.
Therefore, you'd better look into existing email generators for Java or improve your own (for example, by adding numbers or some special characters, which, in turn, may cause plenty of exceptions).
If you are planning to use MockNeat, generating email strings is already implemented.
Example 1:
String corpEmail = mock.emails().domain("startup.io").val();
// Possible Output: tiptoplunge#startup.io
Example 2:
String domsEmail = mock.emails().domains("abc.com", "corp.org").val();
// Possible Output: funjulius#corp.org
Note: mock is the default "mocking" object.
To guarantee uniqueness you could use a counter as part of the email address:
myqo0000#mail.com
bfta0001#mail.com
kjuy0002#mail.com
If you want to stick to letters only then convert the counter to base 26 representation using 'a' to 'z' as the digits.
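A minimal sketch of that base-26 idea, assuming a fixed width and a non-negative counter (the class name, method name, and width are my own choices, not from the question):

```java
public class Base26Counter {
    // Encodes a non-negative counter as a fixed-width string using 'a'..'z' as digits.
    static String toLetters(int counter, int width) {
        if (counter < 0) {
            throw new IllegalArgumentException("counter must be non-negative");
        }
        char[] digits = new char[width];
        // Fill from the least significant digit (rightmost) to the most significant.
        for (int pos = width - 1; pos >= 0; pos--) {
            digits[pos] = (char) ('a' + counter % 26);
            counter /= 26;
        }
        if (counter != 0) {
            throw new IllegalArgumentException("counter does not fit in " + width + " letters");
        }
        return new String(digits);
    }

    public static void main(String[] args) {
        for (int i = 0; i < 3; i++) {
            // Prints aaaa#mail.com, aaab#mail.com, aaac#mail.com
            System.out.println(toLetters(i, 4) + "#mail.com");
        }
    }
}
```

Four letters give 26^4 = 456,976 distinct values, so 1000 unique ids fit easily. You can append this to a random prefix if the ids should not look sequential.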

Is there a performance impact when using Guava.Preconditions with concatenated strings?

In our code, we often check arguments with Preconditions:
Preconditions.checkArgument(expression, "1" + var + "3");
But sometimes, this code is called very often. Could this have a notable negative impact on performance? Should we switch to
Preconditions.checkArgument(expression, "%s%s%s", 1, var, 3);
?
(I expect the condition to be true most of the time. False means a bug.)
If you expect the check to not throw any exception most of the time, there is no reason to use the string concatenation. You'll lose more time concatenating (using .concat or a StringBuilder) before calling the method than doing it after you're sure you're throwing an exception.
Conversely, if you're throwing an exception, you're already in the slow branch.
It's also worth mentioning that Guava uses a custom and faster formatter which accepts only %s. So the loss of time is actually more similar to the standard logger {} handle (in slf4j or log4j 2). But as written above, this is in the case you're already in the slow branch.
In any case, I would strongly recommend against either of your suggestions; I'd use this one instead:
Preconditions.checkArgument(expression, "1%s3", var);
You should only put variables in %s, not constants to gain marginal speed.
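To make the eager-versus-lazy point concrete, here is a sketch with a hypothetical stand-in for the check. It is not Guava's implementation (it uses String.format where Guava has its own %s-only formatter); it only shows when the argument gets rendered. All names are invented for the demo.

```java
public class LazyMessageSketch {
    static int renderCount = 0;

    // Hypothetical value whose toString() counts how often it is rendered.
    static final class Tracked {
        @Override
        public String toString() {
            renderCount++;
            return "tracked";
        }
    }

    // Hypothetical stand-in for a precondition check: formats only on failure.
    static void check(boolean expression, String template, Object... args) {
        if (!expression) {
            throw new IllegalArgumentException(String.format(template, args));
        }
    }

    // Concatenation happens before the call, so toString() runs even though the check passes.
    static int rendersWithConcatenation() {
        renderCount = 0;
        check(true, "1" + new Tracked() + "3");
        return renderCount;
    }

    // With a template, the argument is passed as an object and never rendered on success.
    static int rendersWithTemplate() {
        renderCount = 0;
        check(true, "1%s3", new Tracked());
        return renderCount;
    }

    public static void main(String[] args) {
        System.out.println(rendersWithConcatenation()); // 1
        System.out.println(rendersWithTemplate());      // 0
    }
}
```

The template version still allocates a varargs array per call, so it is not free, just cheaper than building the string.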
In the case of String literal concatenation, the compiler does this at compile time, so there is no runtime performance hit. The Java Language Specification requires this for constant expressions, so every conforming compiler does it.
In the case of variables, constant folding won't work, so there will be work at run time. However, newer Java compilers replace string concatenation with a StringBuilder, which should be more efficient because, unlike String, it is mutable.
This should be faster than using a formatter when the message is actually built. However, if you don't expect the check to fail very often, concatenation can be slower, because it always happens, even when the condition is true and the method does nothing.
Anyway, to wrap it up: I do not think it is worth rewriting the existing calls. However, in new code, you can use the formatter without hesitation.
I wrote a simple test. Using the formatter is much faster, as suggested here. The difference in performance grows with the number of calls (performance with the formatter does not change: O(1)). I guess the garbage collector time grows with the number of calls when using simple strings.
Here is one sample result:
started with 10000000 calls and 100 runs
formatter: 0.94 (mean per run)
string: 181.11 (mean per run)
Formatter is 192.67021 times faster. (this difference grows with number of calls)
Here is the code (Java 8, Guava 18):
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import com.google.common.base.Preconditions;
import com.google.common.base.Stopwatch;
public class App {
public static void main(String[] args) {
int count = 10000000;
int runs = 100;
System.out.println("started with " + count + " calls and " + runs + " runs");
Stopwatch stopwatch = Stopwatch.createStarted();
run(count, runs, i->fast(i));
stopwatch.stop();
float fastTime = (float)stopwatch.elapsed(TimeUnit.MILLISECONDS)/ runs;
System.out.println("fast: " + fastTime + " (mean per run)");
//
stopwatch.reset();
System.out.println("reseted: "+stopwatch.elapsed(TimeUnit.MILLISECONDS));
stopwatch.start();
run(count, runs, i->slow(i));
stopwatch.stop();
float slowTime = (float)stopwatch.elapsed(TimeUnit.MILLISECONDS)/ runs;
System.out.println("slow: " + slowTime + " (mean per run)");
float times = slowTime/fastTime;
System.out.println("Formatter is " + times + " times faster." );
}
private static void run(int count, int runs, Consumer<Integer> function) {
for(int c=0;c<count;c++){
for(int r=0;r<runs;r++){
function.accept(r);
}
}
}
private static void slow(int i) {
Preconditions.checkArgument(true, "var was " + i);
}
private static void fast(int i) {
Preconditions.checkArgument(true, "var was %s", i);
}
}

Why this method does not get optimized away?

This Java method gets used in benchmarks for simulating slow computation:
static int slowItDown() {
int result = 0;
for (int i = 1; i <= 1000; i++) {
result += i;
}
return result;
}
This is IMHO a very bad idea, as its body can get replaced by return 500500. This seems to never happen [1]; probably because of such an optimization being irrelevant for real code as Jon Skeet stated.
Interestingly, a slightly simpler method with result += 1; gets fully optimized away (caliper reports 0.460543 ns).
But even when we agree that optimizing away methods returning a constant result is useless for real code, there's still loop unrolling, which could lead to something like
static int slowItDown() {
int result = 0;
for (int i = 1; i <= 1000; i += 2) {
result += 2 * i + 1;
}
return result;
}
So my question remains: Why is no optimization performed here?
[1] Contrary to what I wrote originally; I must have seen something that wasn't there.
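For reference, a quick check that the original loop, the hand-unrolled variant, and the closed form 1000 * 1001 / 2 all agree (the class and method names are only for this sketch):

```java
public class UnrollCheck {
    // The method from the question: sums 1..1000 one step at a time.
    static int original() {
        int result = 0;
        for (int i = 1; i <= 1000; i++) {
            result += i;
        }
        return result;
    }

    // The unrolled variant: each step adds i and i + 1 together, i.e. 2 * i + 1.
    static int unrolled() {
        int result = 0;
        for (int i = 1; i <= 1000; i += 2) {
            result += 2 * i + 1;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(original());      // 500500
        System.out.println(unrolled());      // 500500
        System.out.println(1000 * 1001 / 2); // 500500
    }
}
```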
Well, the JVM does optimize away such code. The question is how many times it has to be detected as a real hotspot (benchmarks do some more than this single method, usually) before it will be analyzed this way. In my setup it required 16830 invocations before the execution time went to (almost) zero.
It’s correct that such code does not appear in real code. However, it might remain after several inlining operations of other hotspots dealing with values that are not compile-time constants but run-time constants or de facto constants (values that could change in theory but don’t in practice). When such a piece of code remains, it’s a great benefit to optimize it away entirely, but that is not expected to happen soon, i.e. when calling it right from the main method.
Update: I simplified the code and the optimization came even earlier.
public static void main(String[] args) {
final int inner=10;
final float innerFrac=1f/inner;
int count=0;
for(int j=0; j<Integer.MAX_VALUE; j++) {
long t0=System.nanoTime();
for(int i=0; i<inner; i++) slowItDown();
long t1=System.nanoTime();
count+=inner;
final float dt = (t1-t0)*innerFrac;
System.out.printf("execution time: %.0f ns%n", dt);
if(dt<10) break;
}
System.out.println("after "+count+" invocations");
System.out.println(System.getProperty("java.version"));
System.out.println(System.getProperty("java.vm.version"));
}
static int slowItDown() {
int result = 0;
for (int i = 1; i <= 1000; i++) {
result += i;
}
return result;
}
…
execution time: 0 ns
after 15300 invocations
1.7.0_13
23.7-b01
(64Bit Server VM)

Should I use Java's String.format() if performance is important?

We have to build Strings all the time for log output and so on. Over the JDK versions we have learned when to use StringBuffer (many appends, thread safe) and StringBuilder (many appends, non-thread-safe).
What's the advice on using String.format()? Is it efficient, or are we forced to stick with concatenation for one-liners where performance is important?
e.g. ugly old style,
String s = "What do you get if you multiply " + varSix + " by " + varNine + "?";
vs. tidy new style (String.format, which is possibly slower),
String s = String.format("What do you get if you multiply %d by %d?", varSix, varNine);
Note: my specific use case is the hundreds of 'one-liner' log strings throughout my code. They don't involve a loop, so StringBuilder is too heavyweight. I'm interested in String.format() specifically.
I took hhafez's code and added a memory test:
private static void test() {
Runtime runtime = Runtime.getRuntime();
long memory;
...
memory = runtime.freeMemory();
// for loop code
memory = memory-runtime.freeMemory();
I ran this separately for each approach (the '+' operator, String.format, and StringBuilder with a final toString()), so the memory used is not affected by the other approaches.
I added more concatenations, making the string "Blah" + i + "Blah" + i + "Blah" + i + "Blah".
The results are as follows (average of 5 runs each):
Approach        Time (ms)   Memory allocated (long)
+ operator      747         320,504
String.format   16484       373,312
StringBuilder   769         57,344
We can see that String + and StringBuilder are practically identical time-wise, but StringBuilder is much more efficient in memory use.
This is very important when we have many log calls (or any other statements involving strings) in a time interval short enough that the garbage collector won't get to clean up the many string instances resulting from the + operator.
And a note, BTW, don't forget to check the logging level before constructing the message.
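Checking the level first can look like the sketch below. It uses java.util.logging only because it ships with the JDK; the answer does not say which logging framework is in use, and the class name, method names, and counter are made up for the demo.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class GuardedLogging {
    private static final Logger LOG = Logger.getLogger(GuardedLogging.class.getName());
    static int messagesBuilt = 0;

    static {
        // Pin the level for the demo so FINE messages are definitely disabled.
        LOG.setLevel(Level.INFO);
    }

    // Stands in for any message that is costly to build.
    static String expensiveMessage(int six, int nine) {
        messagesBuilt++;
        return "What do you get if you multiply " + six + " by " + nine + "?";
    }

    // Explicit guard: the message is only built when FINE is enabled.
    static void logGuarded(int six, int nine) {
        if (LOG.isLoggable(Level.FINE)) {
            LOG.fine(expensiveMessage(six, nine));
        }
    }

    // Supplier overload (Java 8+): the logger calls the lambda only if it will log.
    static void logWithSupplier(int six, int nine) {
        LOG.fine(() -> expensiveMessage(six, nine));
    }

    public static void main(String[] args) {
        logGuarded(6, 9);
        logWithSupplier(6, 9);
        System.out.println(messagesBuilt); // 0, nothing was built at INFO level
    }
}
```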
Conclusions:
I'll keep on using StringBuilder.
I have too much time or too little life.
I wrote a small class to test which of the two performs better, and + comes out ahead of format by a factor of 5 to 6.
Try it yourself:
import java.io.*;
import java.util.Date;
public class StringTest{
public static void main( String[] args ){
int i = 0;
long prev_time = System.currentTimeMillis();
long time;
for( i = 0; i< 100000; i++){
String s = "Blah" + i + "Blah";
}
time = System.currentTimeMillis() - prev_time;
System.out.println("Time after for loop " + time);
prev_time = System.currentTimeMillis();
for( i = 0; i<100000; i++){
String s = String.format("Blah %d Blah", i);
}
time = System.currentTimeMillis() - prev_time;
System.out.println("Time after for loop " + time);
}
}
Running the above for different N shows that both behave linearly, but String.format is 5-30 times slower.
The reason is that in the current implementation String.format first parses the input with regular expressions and then fills in the parameters. Concatenation with plus, on the other hand, gets optimized by javac (not by the JIT) and uses StringBuilder.append directly.
All the benchmarks presented here have some flaws, thus results are not reliable.
I was surprised that nobody used JMH for benchmarking, so I did.
Results:
Benchmark Mode Cnt Score Error Units
MyBenchmark.testOld thrpt 20 9645.834 ± 238.165 ops/s // using +
MyBenchmark.testNew thrpt 20 429.898 ± 10.551 ops/s // using String.format
Units are operations per second, the more the better. Benchmark source code. OpenJDK IcedTea 2.5.4 Java Virtual Machine was used.
So, old style (using +) is much faster.
Your old ugly style is automatically compiled by JAVAC 1.6 as :
StringBuilder sb = new StringBuilder("What do you get if you multiply ");
sb.append(varSix);
sb.append(" by ");
sb.append(varNine);
sb.append("?");
String s = sb.toString();
So there is absolutely no difference between this and using a StringBuilder.
String.format is a lot more heavyweight since it creates a new Formatter, parses your input format string, creates a StringBuilder, appends everything to it and calls toString().
Java's String.format works like so:
it parses the format string, exploding into a list of format chunks
it iterates over the format chunks, rendering them into a StringBuilder, which is basically an array that resizes itself as necessary by copying into a new array. This is necessary because we don't yet know how large the final String will be
StringBuilder.toString() copies its internal buffer into a new String
If the final destination for this data is a stream (e.g. rendering a webpage or writing to a file), you can assemble the format chunks directly into your stream:
new PrintStream(outputStream, autoFlush, encoding).format("hello %s", "world");
I speculate that the optimizer will optimize away the format string processing. If so, you're left with equivalent amortized performance to manually unrolling your String.format into a StringBuilder.
To expand/correct on the first answer above, it's not translation that String.format would help with, actually.
What String.format will help with is when you're printing a date/time (or a numeric format, etc), where there are localization(l10n) differences (ie, some countries will print 04Feb2009 and others will print Feb042009).
With translation, you're just talking about moving any externalizable strings (like error messages and what-not) into a property bundle so that you can use the right bundle for the right language, using ResourceBundle and MessageFormat.
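The two placeholder styles are easy to mix up, so here is a small sketch of both. The patterns are hard-coded for the demo; in the setup described above the MessageFormat pattern would come from a ResourceBundle, which I left out to keep the example self-contained. Class and method names are made up.

```java
import java.text.MessageFormat;

public class PlaceholderStyles {
    // MessageFormat uses indexed placeholders: {0}, {1}, ...
    static String withMessageFormat(String name) {
        return MessageFormat.format("hello {0}", name);
    }

    // String.format (java.util.Formatter syntax) uses % conversions: %s, %d, ...
    static String withStringFormat(String name) {
        return String.format("hello %s", name);
    }

    public static void main(String[] args) {
        System.out.println(withMessageFormat("world")); // hello world
        System.out.println(withStringFormat("world"));  // hello world
    }
}
```

Indexed placeholders let a translator reorder the arguments in the pattern without touching the code.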
Looking at all the above, I'd say that performance-wise, String.format vs. plain concatenation comes down to what you prefer. If you prefer looking at calls to .format over concatenation, then by all means, go with that.
After all, code is read a lot more than it's written.
In your example, performance probably isn't too different, but there are other issues to consider, namely memory fragmentation. Every concatenation operation creates a new string, even if it's temporary (it takes time to GC it and it's more work). String.format() is just more readable and it involves less fragmentation.
Also, if you're using a particular format a lot, don't forget you can use the Formatter() class directly (all String.format() does is instantiate a one use Formatter instance).
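A sketch of what using the Formatter directly could look like: one Formatter writes into one StringBuilder that is cleared between uses. I have not measured whether this is actually faster, since the format string is still parsed on every call. The class and method names are illustrative.

```java
import java.util.Formatter;
import java.util.Locale;

public class ReusableFormatter {
    // Formats every value with the same pattern, reusing one Formatter and one buffer.
    static String[] formatAll(int[] values) {
        StringBuilder buffer = new StringBuilder();
        String[] results = new String[values.length];
        // Locale.ROOT keeps the digit rendering independent of the default locale.
        try (Formatter formatter = new Formatter(buffer, Locale.ROOT)) {
            for (int k = 0; k < values.length; k++) {
                buffer.setLength(0);                         // clear the previous output
                formatter.format("Blah %d Blah", values[k]); // appends into buffer
                results[k] = buffer.toString();
            }
        }
        return results;
    }

    public static void main(String[] args) {
        for (String s : formatAll(new int[] {1, 42})) {
            System.out.println(s); // Blah 1 Blah, then Blah 42 Blah
        }
    }
}
```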
Also, something else you should be aware of: be careful of using substring(). For example:
String getSmallString() {
String largeString = // load from file; say 2M in size
return largeString.substring(100, 300);
}
That large string is still in memory because, in JDKs before Java 7 update 6, substring() shares the original string's backing array. A better version is:
return new String(largeString.substring(100, 300));
or
return String.format("%s", largeString.substring(100, 300));
The second form is probably more useful if you're doing other stuff at the same time.
Generally you should use String.format because it's relatively fast and it supports globalization (assuming you're actually trying to write something that is read by the user). It also makes it easier to globalize if you're trying to translate one string versus 3 or more per statement (especially for languages that have drastically different grammatical structures).
Now, if you never plan on translating anything, then either rely on Java's built-in conversion of + operators into StringBuilder or use StringBuilder explicitly.
Another perspective, from the logging point of view only.
I see a lot of discussion related to logging in this thread, so I thought I'd add my experience as an answer. Maybe someone will find it useful.
I guess the motivation for logging with a formatter comes from avoiding string concatenation. Basically, you do not want the overhead of string concatenation if you are not going to log the message.
You do not really need to concatenate or format unless you are going to log. Let's say I define a method like this:
public void logDebug(Throwable t, String... args) { // varargs must be the last parameter
if(debugOn) {
// call concat methods for all args
//log the final debug message
}
}
In this approach the concat/formatter is not called at all if it's a debug message and debugOn = false.
Though it would still be better to use StringBuilder instead of a formatter here. The main motivation is to avoid any of that.
At the same time, I do not like adding an "if" block for each logging statement, since:
It affects readability
It reduces coverage in my unit tests, which is confusing when you want to make sure every line is tested.
Therefore I prefer to create a logging utility class with methods like above and use it everywhere without worrying about performance hit and any other issues related to it.
I just modified hhafez's test to include StringBuilder. StringBuilder is 33 times faster than String.format using jdk 1.6.0_10 client on XP. Using the -server switch lowers the factor to 20.
public class StringTest {
public static void main( String[] args ) {
test();
test();
}
private static void test() {
int i = 0;
long prev_time = System.currentTimeMillis();
long time;
for ( i = 0; i < 1000000; i++ ) {
String s = "Blah" + i + "Blah";
}
time = System.currentTimeMillis() - prev_time;
System.out.println("Time after for loop " + time);
prev_time = System.currentTimeMillis();
for ( i = 0; i < 1000000; i++ ) {
String s = String.format("Blah %d Blah", i);
}
time = System.currentTimeMillis() - prev_time;
System.out.println("Time after for loop " + time);
prev_time = System.currentTimeMillis();
for ( i = 0; i < 1000000; i++ ) {
new StringBuilder("Blah").append(i).append("Blah");
}
time = System.currentTimeMillis() - prev_time;
System.out.println("Time after for loop " + time);
}
}
While this might sound drastic, I consider it to be relevant only in rare cases, because the absolute numbers are pretty low: 4 s for 1 million simple String.format calls is sort of ok - as long as I use them for logging or the like.
Update: As pointed out by sjbotha in the comments, the StringBuilder test is invalid, since it is missing a final .toString().
The correct speed-up factor from String.format(.) to StringBuilder is 23 on my machine (16 with the -server switch).
Here is a modified version of hhafez's entry. It includes a StringBuilder option.
public class BLA
{
public static final String BLAH = "Blah ";
public static final String BLAH2 = " Blah";
public static final String BLAH3 = "Blah %d Blah";
public static void main(String[] args) {
int i = 0;
long prev_time = System.currentTimeMillis();
long time;
int numLoops = 1000000;
for( i = 0; i< numLoops; i++){
String s = BLAH + i + BLAH2;
}
time = System.currentTimeMillis() - prev_time;
System.out.println("Time after for loop " + time);
prev_time = System.currentTimeMillis();
for( i = 0; i<numLoops; i++){
String s = String.format(BLAH3, i);
}
time = System.currentTimeMillis() - prev_time;
System.out.println("Time after for loop " + time);
prev_time = System.currentTimeMillis();
for( i = 0; i<numLoops; i++){
StringBuilder sb = new StringBuilder();
sb.append(BLAH);
sb.append(i);
sb.append(BLAH2);
String s = sb.toString();
}
time = System.currentTimeMillis() - prev_time;
System.out.println("Time after for loop " + time);
}
}
Time after for loop 391
Time after for loop 4163
Time after for loop 227
The answer to this depends very much on how your specific Java compiler optimizes the bytecode it generates. Strings are immutable and, theoretically, each "+" operation can create a new one. But, your compiler almost certainly optimizes away interim steps in building long strings. It's entirely possible that both lines of code above generate the exact same bytecode.
The only real way to know is to test the code iteratively in your current environment. Write a quick-and-dirty app that concatenates strings both ways iteratively and see how they time out against each other.
Consider using "hello".concat( "world!" ) for a small number of strings in concatenation. It could be even better for performance than other approaches.
If you have more than 3 strings, then consider using StringBuilder, or just String, depending on the compiler that you use.
