I am an Android developer and not new to Java, but I have some questions about best practices for performance. I'll give some examples from my code so you can decide.
String concatenation
url = "http://www.myserver." + domain + "/rss.php?"
+ rawType + rawCathegory + rawSubCathegory + rawLocality
+ rawRadius + rawKeyword + rawPriceFrom + rawPriceto;
As far as I know, this would create 11 string objects before my url variable is created, right?
I've been taught to use StringBuilder, but my question is: what's the minimum number of strings to concatenate for it to pay off? I think it wouldn't make much sense to use it to concatenate two strings, right?
Local variables
Sometimes I try to "chain" method calls like so
FilterData.getInstance(context).getFilter(position).setActivated(isActivated);
to naively avoid variable allocation, but is it any faster than this?
FilterData filterData = FilterData.getInstance(context);
Filter filter = filterData.getFilter(position);
filter.setActivated(isActivated);
I believe it should be, as I save myself a local variable, but it becomes unreadable if the method names are long, etc.
Loops
http://developer.android.com/training/articles/perf-tips.html says that the enhanced for loop is 3x faster than the regular for loop. That's great, and it's easier to write anyway, but what if I need the index? As far as I know, in an enhanced for loop I need to keep track of it myself, like this:
int index = 0;
for(Object obj : objects) {
// do stuff
index++;
}
Is this still faster than the regular loop?
for(int i = 0; i < objects.size(); i++) {
// do stuff
}
I think the enhanced for loop may optimise the evaluation of that limit, so what if the size() call were hoisted out of the condition like this?
int size = objects.size();
for(int i = 0; i < size; i++) {
// do stuff
}
How would that stand?
Thanks. I know this might be nitpicking and not make much of a difference, but I'd rather learn such common tasks the right way.
Strings:
Unless there's a loop involved, the compiler is clever enough to do the concatenation for you in the best way.
When you're looping, use StringBuilder or StringBuffer.
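A minimal sketch of that distinction, assuming hypothetical names (`ConcatSketch`, `singleExpression`, `inLoop`) and a cut-down version of the URL from the question; it is an illustration, not the asker's actual code:

```java
// Hypothetical sketch: class, method, and parameter names are illustrative only.
final class ConcatSketch {

    // A single expression: the compiler handles the whole chain as one
    // concatenation, so a hand-written StringBuilder adds nothing here.
    static String singleExpression(String domain, String type, String keyword) {
        return "http://www.myserver." + domain + "/rss.php?" + type + keyword;
    }

    // Concatenation spread over loop iterations: keep one StringBuilder
    // alive across the loop instead of creating a new String each pass.
    static String inLoop(String domain, String[] params) {
        StringBuilder url = new StringBuilder("http://www.myserver.")
                .append(domain)
                .append("/rss.php?");
        for (String param : params) {
            url.append(param);
        }
        return url.toString();
    }

    public static void main(String[] args) {
        System.out.println(singleExpression("com", "type=1", "&q=car"));
        System.out.println(inLoop("com", new String[] {"type=1", "&q=car"}));
    }
}
```

Both methods build the same string; the only point is where the explicit builder earns its keep.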
Local Variables:
The two examples you give are identical. The memory still needs to be allocated even if you never give it a name.
Loops:
Depending on the type of loop, using enhanced loops can give a massive or a negligible improvement; it's best to read up on the one you're using.
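For the "what if I need the index" case, here is a small sketch of the two variants from the question side by side. The class and method names are made up for illustration, and nothing here measures speed; it only shows that both forms visit the same elements with the same index:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: names are illustrative, not from the question's code base.
final class IndexedLoopSketch {

    // Enhanced for loop with a manually maintained index.
    static String withEnhancedFor(List<String> items) {
        StringBuilder out = new StringBuilder();
        int index = 0;
        for (String item : items) {
            out.append(index).append('=').append(item).append(';');
            index++;
        }
        return out.toString();
    }

    // Classic loop with size() read once before the loop starts.
    static String withCachedSize(List<String> items) {
        StringBuilder out = new StringBuilder();
        int size = items.size();
        for (int i = 0; i < size; i++) {
            out.append(i).append('=').append(items.get(i)).append(';');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("a", "b", "c");
        System.out.println(withEnhancedFor(items));
        System.out.println(withCachedSize(items));
    }
}
```

Which one is faster depends on the List implementation behind `items`, so it is worth measuring on the collection you actually use.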
Related
In this answer, it says (implies) that String concatenation is optimised into StringBuilder operations anyway, so when I write my code, is there any reason to write StringBuilder code in the source? Note that my use case is different from the OP's question, as I am concatenating/appending hundreds-thousands of lines.
To make myself clearer: I am well aware of the differences between the two; I just don't know if it's worth actually writing StringBuilder code, because it's less readable and its supposedly slower cousin, the String class, is converted automagically in the compilation process anyway.
I think the use of StringBuilder vs + really depends on the context you are using it in.
Generally, with JDK 1.6 and above, the compiler will automatically join strings together using StringBuilder.
String one = "abc";
String two = "xyz";
String three = one + two;
This will compile String three as:
String three = new StringBuilder().append(one).append(two).toString();
This is quite helpful and saves us some runtime. However this process is not always optimal. Take for example:
String out = "";
for( int i = 0; i < 10000 ; i++ ) {
out = out + i;
}
return out;
If we compile to bytecode and then decompile the bytecode generated we get something like:
String out = "";
for( int i = 0; i < 10000; i++ ) {
out = new StringBuilder().append(out).append(i).toString();
}
return out;
The compiler has optimised the inner loop but certainly has not made the best possible optimisations. To improve our code we could use:
StringBuilder out = new StringBuilder();
for( int i = 0 ; i < 10000; i++ ) {
out.append(i);
}
return out.toString();
Now this is more optimal than the compiler-generated code, so there is definitely a need to write code using the StringBuilder/StringBuffer classes in cases where efficient code is needed. The current compilers are not great at dealing with concatenating strings in loops; however, this could change in the future.
You need to look carefully to see where you need to manually apply StringBuilder and try to use it where it will not reduce readability of your code too.
Note: I compiled the code using JDK 1.6 and decompiled it using the javap program, which spits out bytecode. It is fairly easy to interpret and is often a useful reference to look at when trying to optimise code. The compiler does change your code behind the scenes, so it is always interesting to see what it does!
The key phrase in your question is "supposedly slower". You need to identify if this is indeed a bottleneck, and then see which is faster.
If you are about to write this code, but have not written it yet, then write whatever is clearer to you and then, if necessary, see if it is a bottleneck.
While it makes sense to use the code you consider more likely to be faster if both are equally readable, actually taking time to find out which is faster when you don't have a need is a waste of time. Readability above performance until performance is unacceptable.
It depends on the case, but StringBuilder is thought to be a bit faster. If you are doing concatenation inside a loop then I would suggest you use StringBuilder.
Anyway, I would advise you to profile and benchmark your code (if you are doing such a massive append).
Be careful though: instances of StringBuilder are mutable and are not to be shared between threads (unless you really know what you are doing), as opposed to Strings, which are immutable.
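If you do want a first rough number before reaching for a real profiler, something like the following could serve as a starting point. It is a crude sketch with made-up names (`ConcatTimingSketch`, `timeNanos`), it ignores JIT warm-up and garbage collection, and a proper harness such as JMH would give more trustworthy results:

```java
// Hypothetical sketch: a crude timing harness, not a rigorous benchmark.
final class ConcatTimingSketch {

    // Repeated + in a loop: every iteration builds a brand-new String.
    static String withPlus(int n) {
        String out = "";
        for (int i = 0; i < n; i++) {
            out = out + i;
        }
        return out;
    }

    // One StringBuilder reused for the whole loop.
    static String withBuilder(int n) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < n; i++) {
            out.append(i);
        }
        return out.toString();
    }

    // Wall-clock time of a single run, in nanoseconds.
    static long timeNanos(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int n = 10000;
        System.out.println("plus:    " + timeNanos(() -> withPlus(n)) + " ns");
        System.out.println("builder: " + timeNanos(() -> withBuilder(n)) + " ns");
    }
}
```

Treat the printed numbers as a hint only; run it several times and on the target device before drawing conclusions.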
I might've misunderstood your question, but StringBuilder is faster when appending Strings. So, yes, if you are appending "hundreds-thousands of lines", you definitely should use StringBuilder (or StringBuffer if you are running a multithreaded app).
(More thorough answer in comments)
I have a Double that I want to knock the extra digits after the decimal place off of (I'm not too concerned about accuracy but feel free to mention it in your answer) prior to conversion into a String.
I was wondering whether it would be better to cast to an int or to use a DecimalFormat and call format(..) . Also, is it then more efficient to specify String.valueOf() or leave it as it is and let the compiler figure it out?
Sorry if I sound a bit ignorant, I'm genuinely curious to learn more of the technical details.
For reference, I'm drawing text to an Android canvas:
c.drawText("FPS: " + String.valueOf((int)lastFps), xPos, yPos, paint);
Casting will probably be more efficient. This is implemented as native code while using a method will have to go through the java code. Also it's much more readable.
For String.valueOf, I expect the performance to be strictly the same. I find it more readable to just write "string" + intValue than "string" + String.valueOf(intValue).
I made a program that used System.nanoTime() to calculate the execution time of these two methods:
public static void cast() {
for (int i=0; i<1000000; i++) {
int x= (int)Math.random();
}
}
public static void format() {
for (int i=0; i< 1000000; i++) {
DecimalFormat df = new DecimalFormat("#");
df.format(Math.random());
}
}
Here are the respective results:
80984944
6048075593
Granted, my tests probably aren't perfect examples. I'm just using Math.random(), which generates a number that will always cast to 0, which might affect the results. However, these results do make sense: casting should be cheap, since a double-to-int cast compiles to a single conversion instruction rather than a method call.
Edit: If I pull out the instantiation of the formatter for the second example, the program runs in 3155165182ns. If I multiply the random numbers by Integer.MAX_VALUE in both cases (with the instantiation pulled out), the results are: 82100170 and 4174558079. Looks like casting is the way to go.
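On the accuracy side of the question: the two approaches timed above do not produce the same text. The cast truncates, while DecimalFormat rounds (its default rounding mode is HALF_EVEN). A small sketch, with hypothetical names and with Locale.ROOT symbols assumed so the digits do not depend on the device locale:

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

// Hypothetical sketch: names are illustrative only.
final class TruncateVsFormatSketch {

    // Fixed symbols so the output does not vary with the default locale.
    private static final DecimalFormatSymbols SYMBOLS =
            DecimalFormatSymbols.getInstance(Locale.ROOT);

    // Casting drops the fractional part (truncation toward zero).
    static String byCast(double value) {
        return String.valueOf((int) value);
    }

    // DecimalFormat rounds to the nearest integer instead of truncating.
    // A new instance per call keeps the sketch simple; DecimalFormat
    // instances are not safe to share between threads without care.
    static String byFormat(double value) {
        return new DecimalFormat("#", SYMBOLS).format(value);
    }

    public static void main(String[] args) {
        System.out.println(byCast(59.7));   // prints 59
        System.out.println(byFormat(59.7)); // prints 60
    }
}
```

For an FPS counter either is probably fine, but the displayed value can differ by one between the two.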
This is a job for Math.floor().
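One thing to keep in mind when choosing between Math.floor() and a cast: they agree for non-negative values but differ for negative ones, because the cast truncates toward zero while floor rounds toward negative infinity. A quick sketch (names are hypothetical):

```java
// Hypothetical sketch: names are illustrative only.
final class FloorVsCastSketch {

    // Truncates toward zero: -2.7 becomes -2.
    static int byCast(double value) {
        return (int) value;
    }

    // Rounds toward negative infinity first: -2.7 becomes -3.
    static int byFloor(double value) {
        return (int) Math.floor(value);
    }

    public static void main(String[] args) {
        System.out.println(byCast(2.7) + " " + byFloor(2.7));   // prints 2 2
        System.out.println(byCast(-2.7) + " " + byFloor(-2.7)); // prints -2 -3
    }
}
```

For a value like a frame rate, which is never negative, the two give the same result.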
Generally speaking, function/method calls come at the cost of performance overhead. My vote is that typecasting would be faster, but as @Zefiryn suggested, the best way is to create a loop, do each action a multitude of times, and measure the performance with a timer.
I'm not sure about the efficiency of either, but here's a third option that could be interesting to compare:
String.valueOf(doubleValue).substring(0, endInt)
which would give a set number of characters rather than decimals/numbers, and would skip the typecasting but make two function calls instead.
EDIT: Was too curious so I tried running each option:
integerNumberFormat.format(number)
String.valueOf(doubleValue).substring(0, endInt)
String.valueOf((int)doubleValue)
10^6 cycles with the results being ~800 ms, ~300 ms and ~40 ms, respectively. I guess my results won't be immediately translatable to your situation but they could give a hint that the last one is indeed, as the previous posters suggested, the fastest one.
(A question for those who know well the JVM compilation and optimization tricks... :-)
Is either of the "for" and "foreach" patterns clearly superior to the other?
Consider the following two examples:
public void forLoop(String[] text)
{
if (text != null)
{
for (int i=0; i<text.length; i++)
{
// Do something with text[i]
}
}
}
public void foreachLoop(String[] text)
{
if (text != null)
{
for (String s : text)
{
// Do something with s, exactly as with text[i]
}
}
}
Is forLoop faster or slower than foreachLoop?
Assuming that in both cases the text array did not need any sanity checks, is there a clear winner, or is it still too close to call?
EDIT: As noted in some of the answers, the performance should be identical for arrays, whereas the "foreach" pattern could be slightly better for Abstract Data Types like a List. See also this answer which discusses the subject.
From section 14.14.2 of the JLS:
Otherwise, the Expression necessarily has an array type, T[]. Let L1 ... Lm be the (possibly empty) sequence of labels immediately preceding the enhanced for statement. Then the meaning of the enhanced for statement is given by the following basic for statement:
T[] a = Expression;
L1: L2: ... Lm:
for (int i = 0; i < a.length; i++) {
VariableModifiersopt Type Identifier = a[i];
Statement
}
In other words, I'd expect them to end up being compiled to the same code.
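To make that concrete, here is a small sketch with the two forms written out over the same array. The class and method names are invented for the example, and the loop body (summing string lengths) is just a stand-in for "do something with the element":

```java
// Hypothetical sketch: names and the loop body are illustrative only.
final class ArrayLoopSketch {

    // Basic for statement, the form the JLS translation above spells out.
    static int totalLengthIndexed(String[] text) {
        int total = 0;
        for (int i = 0; i < text.length; i++) {
            total += text[i].length();
        }
        return total;
    }

    // Enhanced for statement over the same array.
    static int totalLengthForEach(String[] text) {
        int total = 0;
        for (String s : text) {
            total += s.length();
        }
        return total;
    }

    public static void main(String[] args) {
        String[] text = {"ab", "cde"};
        System.out.println(totalLengthIndexed(text)); // prints 5
        System.out.println(totalLengthForEach(text)); // prints 5
    }
}
```

Running javap -c on the compiled class is an easy way to compare the bytecode of the two methods yourself.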
There's definitely a clear winner: the enhanced for loop is more readable. That should be your primary concern - you should only even consider micro-optimizing this sort of thing when you've proved that the most readable form doesn't perform as well as you want.
You can write your own simple test, which measure the execution time.
long start = System.currentTimeMillis();
forLoop(text);
long end = System.currentTimeMillis();
long result = end - start;
result is the execution time in milliseconds.
Since you are using an array type, the performance difference wouldn't matter. They would end up giving the same performance after going through the optimization funnel.
But if you are using ADTs like List, then the forEachLoop is obviously the best choice compared to multiple get(i) calls.
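A sketch of why that matters, assuming the List happens to be a LinkedList (my assumption for the example, as are all the names): each get(i) on a linked list has to walk the nodes to reach position i, while the for-each loop uses the list's iterator and moves one node per step.

```java
import java.util.LinkedList;
import java.util.List;

// Hypothetical sketch: names are illustrative only.
final class ListLoopSketch {

    // Index-based access: on a LinkedList every get(i) walks the list again.
    static long sumByIndex(List<Integer> values) {
        long sum = 0;
        for (int i = 0; i < values.size(); i++) {
            sum += values.get(i);
        }
        return sum;
    }

    // For-each: driven by the list's Iterator, one step per element.
    static long sumByForEach(List<Integer> values) {
        long sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    // Builds a LinkedList holding 1..n.
    static List<Integer> sample(int n) {
        List<Integer> list = new LinkedList<>();
        for (int i = 1; i <= n; i++) {
            list.add(i);
        }
        return list;
    }

    public static void main(String[] args) {
        List<Integer> values = sample(100);
        System.out.println(sumByIndex(values));   // prints 5050
        System.out.println(sumByForEach(values)); // prints 5050
    }
}
```

With an ArrayList the gap largely disappears, since get(i) is a direct array access there.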
You should choose the option which is more readable almost every time, unless you know you have a performance issue.
In this case, I would say they are guaranteed to be the same.
The only difference is your extra check of text.length, which is likely to be slower rather than faster.
I would also ensure text is never null statically, e.g. using an @NotNull annotation. It's better to catch these issues at compile/build time (and it would be faster).
There is no performance penalty for using the for-each loop, even for arrays. In fact, it may offer a slight performance advantage over an ordinary for loop in some circumstances, as it computes the limit of the array index only once. For details follow this post.
I had a challenge to print out multiples of 7 (non-negative) to the 50th multiple in the simplest way humanly possible using for loops.
I came up with this (Ignoring the data types)
for(int i = 0; i <= 350; i += 7)
{System.out.println(i);}
The other guy came up with this
for(int i=0;i <=50; i++)
{
System.out.println(7*i);
}
However, I feel the two code snippets could be further optimized. If they can, please tell me how. And what are the advantages/disadvantages of one over the other?
If you really want to optimize it, do this:
System.out.print("0\n7\n14\n21\n28\n35\n42\n49\n56\n63\n70\n77\n84\n91\n98\n105\n112\n119\n126\n133\n140\n147\n154\n161\n168\n175\n182\n189\n196\n203\n210\n217\n224\n231\n238\n245\n252\n259\n266\n273\n280\n287\n294\n301\n308\n315\n322\n329\n336\n343\n350");
and it's O(1) :)
The first one technically performs fewer operations (no multiplication).
The second one is slightly more readable (50 multiples of 7 vs. multiples of 7 up to 350).
Probably can't be optimized any further.
Unless you're willing to optimize away multiple println calls by doing:
StringBuilder s = new StringBuilder();
for(int i = 0; i <= 350; i += 7) s.append(i).append(", ");
System.out.println(s.toString());
(IIRC printlns are relatively expensive.)
This is getting to the point where you gain a tiny bit of optimization at the expense of simplicity.
In theory, your code is faster since it needs one less multiplication instruction per loop iteration.
However, the multiple calls to System.out.println (and the integer-to-string conversion) will dwarf the runtime the multiplication takes. To optimize, aggregate the Strings with a StringBuilder and output the whole result (or output the result when memory becomes a problem).
However, in real-world code, this is extremely unlikely to be the bottleneck. Profile, then optimize.
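As a rough sketch of the aggregation idea, keeping one value per line as in the original loops (the class and method names are made up, and `count` is the index of the last multiple, so 50 gives 0 through 350):

```java
// Hypothetical sketch: names are illustrative only.
final class MultiplesSketch {

    // Collects 0, 7, 14, ... up to 7 * count, one value per line.
    static String multiplesOfSeven(int count) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i <= count; i++) {
            out.append(7 * i).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // A single write instead of one println call per number.
        System.out.print(multiplesOfSeven(50));
    }
}
```

Whether this is measurably faster depends on how System.out is buffered on your platform, so it is worth timing before adopting it.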
The second function is the best you would get:
O(n)
I wrote some code that looks similar to the following:
String SKIP_FIRST = "foo";
String SKIP_SECOND = "foo/bar";
int skipFooBarIndex(String[] list){
int index;
if (list.length >= (index = 1) && list[0].equals(SKIP_FIRST) ||
list.length >= (index = 2) &&
(list[0] + "/" + list[1]).equals(SKIP_SECOND)){
return index;
}
return 0;
}
String[] myArray = "foo/bar/apples/peaches/cherries".split("/");
print(skipFooBarIndex(myArray));
This changes state inside of the if statement by assigning index. However, my coworkers disliked this very much.
Is this a harmful practice? Is there any reason to do it?
Yes. This clearly reduces readability. What's wrong with the following code?
int skipFooBarIndex(String[] list){
if(list.length >= 1 && list[0].equals(SKIP_FIRST))
return 1;
if(list.length >= 2 && (list[0] + "/" + list[1]).equals(SKIP_SECOND))
return 2;
return 0;
}
It's much easier to understand. In general, having side effects in expressions is discouraged as you'll be relying on the order of evaluation of subexpressions.
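If you want to convince yourself (or your coworkers) that the rewrite behaves the same, a sketch like this puts both versions next to each other. The constants are taken from the question; the class and method names are mine:

```java
// Hypothetical sketch: class and method names are illustrative only.
final class SkipIndexSketch {
    static final String SKIP_FIRST = "foo";
    static final String SKIP_SECOND = "foo/bar";

    // The version from the question: index is assigned inside the condition.
    static int withSideEffect(String[] list) {
        int index;
        if (list.length >= (index = 1) && list[0].equals(SKIP_FIRST) ||
                list.length >= (index = 2) &&
                (list[0] + "/" + list[1]).equals(SKIP_SECOND)) {
            return index;
        }
        return 0;
    }

    // The rewrite: two plain conditions, no assignment inside them.
    static int withoutSideEffect(String[] list) {
        if (list.length >= 1 && list[0].equals(SKIP_FIRST)) {
            return 1;
        }
        if (list.length >= 2 && (list[0] + "/" + list[1]).equals(SKIP_SECOND)) {
            return 2;
        }
        return 0;
    }

    public static void main(String[] args) {
        String[] myArray = "foo/bar/apples/peaches/cherries".split("/");
        System.out.println(withSideEffect(myArray));
        System.out.println(withoutSideEffect(myArray));
    }
}
```

Feeding both methods the same inputs in a unit test is a cheap way to back the refactoring.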
Assuming you count it as "clever" code, it's good to always remember Brian Kernighan's quote:
Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it.
...However, my coworkers disliked this very much...
Yes, it is. Just because you can code it like that doesn't mean you have to.
Remember that that piece of code will eventually have to be maintained by someone (that someone may be yourself in 8 months).
Changing the state inside the if makes it harder to read and understand (mostly because it is uncommon).
Quoting Martin Fowler:
Any fool can write code that a computer can understand. Good programmers write code that humans can understand
There's an excellent reason not to do it: it makes your code really hard to understand and reason about.
The problem is that the code would generate multiple-WTFs in a code review session. Anything that makes people go "wait, what?" has got to go.
It's sadly easy enough to create bugs even in easy-to-read code. No reason to make it even easier.
Yes, side effects are hard to follow when reviewing code.
Regarding reasons to do it: No, there is no real reason to do it. I haven't yet stumbled upon an if statement that can't be rewritten without side effects without having any loss.
The only thing wrong with it is that it's unfamiliar and confusing to people who didn't write it, at least for a minute while they figure it out. I would probably write it like this to make it more readable:
if (list.length >= 1 && list[0].equals(SKIP_FIRST)) {
return 1;
}
if (list.length >= 2 && (list[0] + "/" + list[1]).equals(SKIP_SECOND)) {
return 2;
}
Borrowed from cppreference.com:
One important aspect of C++ that is related to operator precedence is the order of evaluation and the order of side effects in expressions. In some circumstances, the order in which things happen is not defined. For example, consider the following code:
float x = 1;
x = x / ++x;
The value of x is not guaranteed to be consistent across different compilers, because it is not clear whether the computer should evaluate the left or the right side of the division first. Depending on which side is evaluated first, x could take a different value.
Furthermore, while ++x evaluates to x+1, the side effect of actually storing that new value in x could happen at different times, resulting in different values for x.
The bottom line is that expressions like the one above are horribly ambiguous and should be avoided at all costs. When in doubt, break a single ambiguous expression into multiple expressions to ensure that the order of evaluation is correct.
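Note that the quote is about C++. Java, by contrast, specifies that the operands of a binary operator are evaluated left to right (JLS section 15.7), so the same expression has one defined result there. A small sketch, with a made-up class name, of what that means for the example above:

```java
// Hypothetical sketch: names are illustrative only.
final class EvalOrderSketch {

    static float divideByIncrement() {
        float x = 1;
        // Java reads the left operand first (1.0), then ++x sets x to 2.0
        // and yields 2.0, so the division is 1.0 / 2.0.
        x = x / ++x;
        return x;
    }

    public static void main(String[] args) {
        System.out.println(divideByIncrement()); // prints 0.5
    }
}
```

Being well defined does not make it readable, though; the advice to split such expressions up still applies.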
Is this a harmful practice?
Absolutely yes. The code is hard to understand. It takes two or three reads for anyone but the author. Any code that is hard to understand and that can be rewritten in a simpler way that is easier to understand SHOULD be rewritten that way.
Your colleagues are absolutely right.
Is there any reason to do it?
The only possible reason for doing something like that is that you have extensively profiled the application and found this part of code to be a significant bottleneck. Then you have implemented the abomination above, rerun the profiler, and found that it REALLY improves the performance.
Well, I spent some time reading the above without realising what was going on. So I would definitely suggest that it's not ideal. I wouldn't really ever expect the if() statement itself to change state.
I wouldn't recommend an if condition having side-effects without a very good reason. For me, this particular example took several looks to figure out what was going on. There may be a case where it isn't so bad, although I certainly can't think of one.
Ideally, each piece of code should do one thing. Making it do more than one thing is potentially confusing, and confusing is exactly what you don't want in your code.
The code in the condition of an if statement is supposed to generate a boolean value. Tasking it with assigning a value is making it do two things, which is generally bad.
Moreover, people expect conditions to be just conditions, and they often glance over them when they're getting an impression of what the code is doing. They don't carefully parse everything until they decide they need to.
Stick that in code I'm reviewing and I'll flag it as a defect.
You can also use a ternary to avoid multiple returns:
int skipFooBarIndex(String[] list) {
return (list.length > 0 && list[0].equals(SKIP_FIRST)) ? 1 :
((list.length > 1 && (list[0] + "/" + list[1]).equals(SKIP_SECOND)) ? 2 : 0);
}
Though this example is less readable.
Speaking as someone who does a lot of maintenance programming: if I came across this I would curse you, weep and then change it.
Code like this is a nightmare - it screams one of two things
I'm new here and I need help doing the right thing.
I think I am very clever because I have saved lines of code or I have fooled the compiler and made it quicker. It's not clever, it's not optimal, and it's not funny
;)
In C it's fairly common to change state inside if statements. Generally speaking, I find that there are a few unwritten rules on where this is acceptable, for example:
You are reading into a variable and checking the result:
int a;
...
if ((a = getchar()) == 'q') { ... }
Incrementing a value and checking the result:
int *a = (int *)0xdeadbeef;
...
if (5 == *(a++)) { ... }
And when it is not acceptable:
You are assigning a constant to a variable:
int a;
...
if (a = 5) { ... } // this is almost always unintentional
Mixing and matching pre- and post-increment, and short-circuiting:
int a = 0, b;
...
if (b || a++) { ... } // BAD!
For some reason the font for sections I'm trying to mark as code is not fixed-width on SO, but in a fixed width font there are situations where assignment inside if expressions is both sensible and clear.