Clean code vs performance - java

Some principles of clean code are:
functions should do one thing at one abstraction level
functions should be at most 20 lines long
functions should never have more than 2 input parameters
How many cpu cycles are "lost" by adding an extra function call in Java?
Are there compiler options available that transform many small functions into one big function in order to optimize performance?
E.g.
void foo() {
    bar1();
    bar2();
}

void bar1() {
    a();
    b();
}

void bar2() {
    c();
    d();
}
Would become
void foo() {
    a();
    b();
    c();
    d();
}

How many cpu cycles are "lost" by adding an extra function call in Java?
This depends on whether the call is inlined or not. If it is inlined, the cost is nothing (or a notional amount).
If the method is not compiled at runtime, it hardly matters, because the cost of interpreting dominates any such micro-optimisation, and the method is likely not called enough to matter (which is why it wasn't optimised).
The only time it really matters is when the code is called often, yet for some reason is prevented from being optimised. I would only assume this is the case because a profiler is telling you it is a performance issue, and in that case manual inlining might be the answer.
I design, develop and optimise latency-sensitive code in Java, and I choose to manually inline methods much less than 1% of the time, and only after a profiler such as Flight Recorder suggests there is a significant performance problem.
In the rare event it matters, how much difference does it make?
I would estimate between 0.03 and 0.1 microseconds in real applications for each extra call; in a micro-benchmark it would be far less.
Are there compiler options available that transform many small functions into one big function in order to optimize performance?
Yes. In fact, what can happen is that not only are all these methods inlined, but the methods which call them are inlined as well, so that none of the calls exists at runtime, but only if the code is called enough to be optimised. That is, not only are a, b, c and d inlined, but foo is inlined into its callers too.
By default the Oracle JVM can inline to a depth of 9 levels (or until the inlined code exceeds 325 bytes of bytecode).
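To see these decisions for yourself, HotSpot can report them via diagnostic flags. Below is a small sketch (class and method names are made up) you could run with -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining to watch the call chain get flattened; the exact output format varies by JVM version.

```java
// Sketch: a chain of tiny methods that HotSpot will typically inline once hot.
// Run with: java -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining InlineDemo
public class InlineDemo {
    static int a(int x) { return x + 1; }
    static int b(int x) { return a(x) * 2; }
    static int c(int x) { return b(x) - 3; }   // c -> b -> a: three levels deep

    static long hotLoop(int iterations) {
        long sum = 0;
        for (int i = 0; i < iterations; i++) {
            sum += c(i);                       // hot call site, a prime inlining candidate
        }
        return sum;
    }

    public static void main(String[] args) {
        // Enough iterations to cross the JIT compile threshold.
        System.out.println(hotLoop(1_000_000));
    }
}
```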
Will clean code help performance?
The JVM runtime optimiser has common patterns it optimises for. Clean, simple code is generally easier to optimise, and when you try something tricky or non-obvious, you can end up being much slower. If code is harder for a human to understand, there is a good chance it is also harder for the optimiser to understand and optimise.

Runtime behavior and cleanliness of code (a compile time or life time property of code) belong to different requirement categories. There might be cases where optimizing for one category is detrimental to the other.
The question is: which category really needs your attention?
In my view, cleanliness of code (or the malleability of software) suffers from a huge lack of attention. You should focus on that first. Only if other requirements start to fall behind (e.g. performance) should you inquire whether that's due to how clean the code is. That means you need to really compare; you need to measure the difference it makes. With regard to performance, use a profiler of your choice: run the "dirty" code variant and the clean variant and check the difference. Is the difference significant? Only if the "dirty" variant is significantly faster should you lower the cleanliness.

Consider the following piece of code, which compares a code that does 3 things in one for loop to another that has 3 different for loops for each task.
@Test
public void singleLoopVsMultiple() {
    for (int j = 0; j < 5; j++) {
        // single loop
        int x = 0, y = 0, z = 0;
        long l = System.currentTimeMillis();
        for (int i = 0; i < 100000000; i++) {
            x++;
            y++;
            z++;
        }
        l = System.currentTimeMillis() - l;
        // multiple loops doing the same thing
        int a = 0, b = 0, c = 0;
        long m = System.currentTimeMillis();
        for (int i = 0; i < 100000000; i++) {
            a++;
        }
        for (int i = 0; i < 100000000; i++) {
            b++;
        }
        for (int i = 0; i < 100000000; i++) {
            c++;
        }
        m = System.currentTimeMillis() - m;
        System.out.println(String.format("%d,%d", l, m));
    }
}
When I run it, here is the output I get for time in milliseconds.
6,5
8,0
0,0
0,0
0,0
After a few runs, the JVM is able to identify hotspots of intensive code and optimises parts of the code to make them significantly faster. In our previous example, after 2 runs the JVM had already optimised the code so much that the discussion around for-loops became redundant.
Unless we know what's happening inside, we cannot predict the performance implications of changes like introduction of for-loops. The only way to actually improve the performance of a system is by measuring it and focusing only on fixing the actual bottlenecks.
There is a chance that cleaning your code may make it faster for the JVM. But even if that is not the case, every performance optimisation comes with added code complexity. Ask yourself whether the added complexity is worth the future maintenance effort. After all, the most expensive resource on any team is the developer, not the servers, and any additional complexity slows the developer down, adding to the project cost.
The way to deal with it is to figure out your benchmarks: what kind of application you're making and where its bottlenecks are. If you're making a web app, perhaps the DB is taking most of the time, and reducing the number of functions will not make a difference. On the other hand, if it's an app running on a system where performance is everything, every small thing counts.

Related

Is it a sensible optimization to check whether a variable holds a specific value before writing that value?

if (var != X)
    var = X;
Is it sensible or not? Will the compiler always optimize-out the if statement? Are there any use cases that would benefit from the if statement?
What if var is a volatile variable?
I'm interested in both C++ and Java answers as the volatile variables have different semantics in both of the languages. Also the Java's JIT-compiling can make a difference.
The if statement introduces branching and an additional read that wouldn't happen if we always overwrote var with X, so it's bad. On the other hand, if var == X then with this optimization we perform only a read and no write, which could have some effects on the cache. Clearly, there are some trade-offs here. I'd like to know what it looks like in practice. Has anyone done any testing on this?
EDIT:
I'm mostly interested in how it looks in a multi-processor environment. In a trivial situation there doesn't seem to be much sense in checking the variable first, but when cache coherency has to be kept between processors/cores, the extra check might actually be beneficial. I just wonder how big an impact it can have. Also, shouldn't the processor do such an optimization itself? If var == X, assigning it the value X once more should not 'dirty up' the cache. But can we rely on this?
Yes, there are definitely cases where this is sensible, and as you suggest, volatile variables are one of those cases - even for single threaded access!
Volatile writes are expensive, both from a hardware and a compiler/JIT perspective. At the hardware level, these writes might be 10x-100x more expensive than a normal write, since write buffers have to be flushed (on x86, the details will vary by platform). At the compiler/JIT level, volatile writes inhibit many common optimizations.
Speculation, however, can only get you so far - the proof is always in the benchmarking. Here's a microbenchmark that tries your two strategies. The basic idea is to copy values from one array to another (pretty much System.arraycopy), with two variants - one which copies unconditionally, and one that checks to see if the values are different first.
Here are the copy routines for the simple, non-volatile case (full source here):
// no check
for (int i = 0; i < ARRAY_LENGTH; i++) {
    target[i] = source[i];
}

// check, then set if unequal
for (int i = 0; i < ARRAY_LENGTH; i++) {
    int x = source[i];
    if (target[i] != x) {
        target[i] = x;
    }
}
The results using the above code to copy an array length of 1000, using Caliper as my microbenchmark harness, are:
benchmark    arrayType   ns    linear runtime
CopyNoCheck  SAME         470  =
CopyNoCheck  DIFFERENT    460  =
CopyCheck    SAME        1378  ===
CopyCheck    DIFFERENT   1856  ====
This also includes about 150ns of overhead per run to reset the target array each time. Skipping the check is much faster - about 0.47 ns per element (or around 0.32 ns per element after we remove the setup overhead, so pretty much exactly 1 cycle on my box).
Checking is about 3x slower when the arrays are the same, and 4x slower when they are different. I'm surprised at how bad the check is, given that it is perfectly predicted. I suspect that the culprit is largely the JIT - with a much more complex loop body, it may be unrolled fewer times, and other optimizations may not apply.
Let's switch to the volatile case. Here, I've used AtomicIntegerArray as my arrays of volatile elements, since Java doesn't have any native array types with volatile elements. Internally, this class is just writing straight through to the array using sun.misc.Unsafe, which allows volatile writes. The assembly generated is substantially similar to normal array access, other than the volatile aspect (and possibly range check elimination, which may not be effective in the AIA case).
Here's the code:
// no check
for (int i = 0; i < ARRAY_LENGTH; i++) {
    target.set(i, source[i]);
}

// check, then set if unequal
for (int i = 0; i < ARRAY_LENGTH; i++) {
    int x = source[i];
    if (target.get(i) != x) {
        target.set(i, x);
    }
}
And here are the results:
arrayType  benchmark      us     linear runtime
SAME       CopyCheckAI     2.85  =======
SAME       CopyNoCheckAI  10.21  ===========================
DIFFERENT  CopyCheckAI    11.33  ==============================
DIFFERENT  CopyNoCheckAI  11.19  =============================
The tables have turned. Checking first is ~3.5x faster than the usual method. Everything is much slower overall - in the check case, we are paying ~3 ns per loop, and in the worst cases ~10 ns (the times above are in us, and cover the copy of the whole 1000-element array). Volatile writes really are more expensive. There is about 1 ns of overhead included in the DIFFERENT case to reset the array on each iteration (which is why even the simple variant is slightly slower for DIFFERENT). I suspect a lot of the overhead in the "check" case is actually bounds checking.
This is all single threaded. If you actually had cross-core contention over a volatile, the results would be much, much worse for the simple method, and just about as good as the above for the check case (the cache line would just sit in the shared state - no coherency traffic needed).
I've also only tested the extremes of "every element equal" vs "every element different". This means the branch in the "check" algorithm is always perfectly predicted. If you had a mix of equal and different, you wouldn't get just a weighted combination of the times for the SAME and DIFFERENT cases - you would do worse, due to misprediction (both at the hardware level, and perhaps also at the JIT level, which can no longer optimize for the always-taken branch).
So whether it is sensible, even for volatile, depends on the specific context - the mix of equal and unequal values, the surrounding code, and so on. I'd usually not do it for volatile alone in a single-threaded scenario, unless I suspected a large number of sets were redundant. In heavily multi-threaded structures, however, reading and then conditionally doing a volatile write (or other expensive operation, like a CAS) is a best practice, and you'll see it in quality code such as the java.util.concurrent structures.
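As a sketch of that best practice, here is a hypothetical "track the maximum" class in the java.util.concurrent style: it does a cheap volatile read first and only pays for the CAS when the value actually needs to change. The class name and structure are illustrative, not taken from any particular library.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of the "check before expensive write" idiom: read the (cheap)
// volatile value first and skip the (expensive) CAS when it would be a no-op.
public class MaxTracker {
    private final AtomicInteger max = new AtomicInteger(Integer.MIN_VALUE);

    public void update(int candidate) {
        int current = max.get();               // cheap volatile read
        while (candidate > current) {          // only attempt a write when needed
            if (max.compareAndSet(current, candidate)) {
                return;                        // expensive CAS only on actual change
            }
            current = max.get();               // lost a race; re-read and retry
        }
    }

    public int get() { return max.get(); }
}
```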
Is it a sensible optimization to check whether a variable holds a specific value before writing that value?
Are there any use cases that would benefit from the if statement?
It is when assignment is significantly more costly than an inequality comparison returning false.
An example would be a large* std::set, which may require many heap allocations to duplicate.
*for some definition of "large"
Will the compiler always optimize-out the if statement?
That's a fairly safe "no", as are most questions that contain both "optimize" and "always".
The C++ standard makes rare mention of optimizations, but never demands one.
What if var is a volatile variable?
Then it may perform the if, although volatile doesn't achieve what most people assume.
In general the answer is no. If you have a simple data type, the compiler will be able to perform any necessary optimizations; and in the case of types with a heavy operator=, it is the responsibility of operator= to choose the optimal way to assign a new value.
There are situations where even a trivial assignment of say a pointersized variable can be more expensive than a read and branch (especially if predictable).
Why? Multithreading. If several threads are only reading the same value, they can all share that value in their caches. But as soon as you write to it, you have to invalidate the cacheline and get the new value the next time you want to read it or you have to get the updated value to keep your cache coherent. Both situations lead to more traffic between the cores and add latency to the reads.
If the branch is pretty unpredictable though it's probably still slower.
In C++, assigning a SIMPLE variable (that is, a normal integer or float variable) is definitely and always faster than checking if it already has that value and then setting it if it didn't have the value. I would be very surprised if this wasn't true in Java too, but I don't know how complicated or simple things are in Java - I've written a few hundred lines, and not actually studied how byte code and JITed bytecode actually works.
Clearly, if the variable is very easy to check, but complicated to set, which could be the case for classes and other such things, then there may be a value. The typical case where you'd find this would be in some code where the "value" is some sort of index or hash, but if it's not a match, a whole lot of work is required. One example would be in a task-switch:
if (current_process != new_process_to_run)
    current_process = new_process_to_run;
Because here, a "process" is a complex object to alter, but the != can be done on the ID of the process.
Whether the object is simple or complex, the compiler will almost certainly not understand what you are trying to do here, so it will probably not optimize it away - but compilers are more clever than you think SOMETIMES, and more stupid at other times, so I wouldn't bet either way.
volatile should always force the compiler to read and write values to the variable, whether it "thinks" it is necessary or not, so it will definitely READ the variable and WRITE the variable. Of course, if the variable is volatile it probably means that it can change or represents some hardware, so you should be EXTRA careful with how you treat it yourself too... An extra read of a PCI-X card could incur several bus cycles (bus cycles being an order of magnitude slower than the processor speed!), which is likely to affect the performance much more. But then writing to a hardware register may (for example) cause the hardware to do something unexpected, and checking that we have that value first MAY make it faster, because "some operation starts over", or something like that.
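A Java analogue of the "cheap check, expensive set" situation is a setter whose side effects, rather than the store itself, are costly. This sketch, with invented names, skips an expensive listener fan-out when the value is unchanged:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntConsumer;

// Sketch: checking before assignment pays off when the assignment's side
// effects are expensive. Here the "expensive" part is notifying listeners.
public class Temperature {
    private int degrees;
    private final List<IntConsumer> listeners = new ArrayList<>();

    public void addListener(IntConsumer l) { listeners.add(l); }

    public void setDegrees(int value) {
        if (degrees == value) {
            return;                    // cheap check skips the costly notification
        }
        degrees = value;
        for (IntConsumer l : listeners) {
            l.accept(value);           // potentially expensive fan-out
        }
    }

    public int getDegrees() { return degrees; }
}
```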
It would be sensible if you had read-write locking semantics involved, whenever reading is usually less disruptive than writing.
In Objective-C you have the situation where assigning an object address to a pointer variable may require that the object be "retained" (reference count incremented). In such a case it makes sense to see if the value being assigned is the same as the value currently in the pointer variable, to avoid having to do the relatively expensive increment/decrement operations.
Other languages that use reference counting likely have similar scenarios.
But when assigning, say, an int or a boolean to a simple variable (outside of the multiprocessor cache scenario mentioned elsewhere) the test is rarely merited. The speed of a store in most processors is at least as fast as the load/test/branch.
In Java the answer is always no: every assignment copies either a primitive value or a reference, both of which are cheap. In C++, the answer is still pretty much always no; if copying is so much more expensive than an equality check, the class in question should do that equality check itself.

Measuring time interval and out of order execution

I've been reading about Java memory model and I'm aware that it's possible for compiler to reorganize statements to optimize code.
Suppose I had the following code:
long tick = System.nanoTime();
function_or_block_whose_time_i_intend_to_measure();
long tock = System.nanoTime();
would the compiler ever reorganize the code in a way that what I intend to measure is not executed between tick and tock? For example,
long tick = System.nanoTime();
long tock = System.nanoTime();
function_or_block_whose_time_i_intend_to_measure();
If so, what's the right way to preserve execution order?
EDIT:
Example illustrating out-of-order execution with nanoTime :
public class Foo {
    public static void main(String[] args) {
        while (true) {
            long x = 0;
            long tick = System.nanoTime();
            for (int i = 0; i < 10000; i++) { // This for block takes ~15sec on my machine
                for (int j = 0; j < 600000; j++) {
                    x = x + x * x;
                }
            }
            long tock = System.nanoTime();
            System.out.println("time=" + (tock - tick));
            x = 0;
        }
    }
}
Output of above code:
time=3185600
time=16176066510
time=16072426522
time=16297989268
time=16063363358
time=16101897865
time=16133391254
time=16170513289
time=16249963612
time=16263027561
time=16239506975
In the above example, the time measured in the first iteration is significantly lower than in the subsequent runs. I thought this was due to out-of-order execution. What am I doing wrong in the first iteration?
would the compiler ever reorganize the code in a way that what I intend to measure is not executed between tick and tock?
Nope, that would never happen. If that compiler optimization ever messed up, it would be a very serious bug. Quoting a statement from the wiki:
The runtime (which, in this case, usually refers to the dynamic compiler, the processor and the memory subsystem) is free to introduce any useful execution optimizations as long as the result of the thread in isolation is guaranteed to be exactly the same as it would have been had all the statements been executed in the order the statements occurred in the program (also called program order).
So the optimization may be done as long as the result is the same as when executed in program order. In the case that you cited I would assume the optimization is local and that there are no other threads that would be interested in this data. These optimizations are done to reduce the number of trips made to main memory which can be costly. You will only run into trouble with these optimizations when multiple threads are involved and they need to know each other's state.
Now if two threads need to see each other's state consistently, they can use volatile variables or a memory barrier (synchronized) to force serialization of writes/reads to main memory. InfoQ ran a nice article on this that might interest you.
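A minimal sketch of that volatile-based visibility, assuming nothing beyond the standard library: the writer publishes its data with a volatile write, and the reader's volatile read guarantees it sees the payload.

```java
// Minimal sketch: a volatile flag forces the writer's updates to become
// visible to the reader thread.
public class VisibilityDemo {
    static volatile boolean ready = false;
    static int payload = 0;

    public static int runOnce() {
        ready = false;
        payload = 0;
        Thread writer = new Thread(() -> {
            payload = 42;     // ordinary write...
            ready = true;     // ...published by the volatile write (happens-before)
        });
        writer.start();
        while (!ready) {      // volatile read; without 'volatile' this could spin forever
            Thread.onSpinWait();
        }
        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return payload;       // guaranteed to see 42
    }

    public static void main(String[] args) {
        System.out.println(runOnce());
    }
}
```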
The Java Memory Model (JMM) defines a partial ordering called happens-before on all actions within the program. There are seven rules defined to ensure happens-before ordering. One of them is called the program order rule:
Program order rule. Each action in a thread happens-before every action in that thread that comes later in the program order.
According to this rule, your code will not be re-ordered by the compiler.
The book Java Concurrency in Practice gives an excellent explanation on this topic.

Java compiler optimization for repeated method calls?

Does the java compiler (the default javac that comes in JDK1.6.0_21) optimize code to prevent the same method from being called with the same arguments over and over? If I wrote this code:
public class FooBar {
    public static void main(String[] args) {
        foo(bar);
        foo(bar);
        foo(bar);
    }
}
Would the method foo(bar) only run once? If so, is there any way to prevent this optimization? (I'm trying to compare runtime for two algos, one iterative and one comparative, and I want to call them a bunch of times to get a representative sample)
Any insight would be much appreciated; I took this problem to the point of insanity (I thought my computer was insanely fast for a little while, so I kept adding method calls until I got the "code too large" error at 43671 lines).
The optimization you are observing probably has nothing to do with repeated calls... because that would be an invalid optimization. More likely, the optimizer has figured out that the method calls have no observable effect on the computation.
The cure is to change the method so that it does affect the result of computation ...
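One sketch of that cure: feed every call's result into a running sum and use the sum, so the JIT cannot treat the calls as dead code. The work method here is a made-up stand-in for whatever you are measuring.

```java
// Sketch: defeating dead-code elimination by making every call's result observable.
public class KeepAlive {
    static int work(int x) {
        return (x * 31) ^ (x >>> 3);   // stand-in for the method being measured
    }

    public static long measuredSum(int iterations) {
        long sum = 0;
        for (int i = 0; i < iterations; i++) {
            sum += work(i);            // result feeds the sum: the JIT cannot drop the call
        }
        return sum;                    // ...and the sum is returned/printed, so it is "used"
    }

    public static void main(String[] args) {
        long t0 = System.nanoTime();
        long sum = measuredSum(1_000_000);
        long t1 = System.nanoTime();
        System.out.println("sum=" + sum + " ns=" + (t1 - t0));
    }
}
```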
It doesn't; that would cause a big problem if foo is non-pure (i.e. it changes the global state of the program). For example:
public class FooBar {
    private static int i = 0;

    private static int foo() {
        return ++i;
    }

    public static void main(String[] args) {
        foo();
        foo();
        foo();
        System.out.println(i); // prints 3
    }
}
You haven't provided enough information to allow for any definitive answers, but the jvm runtime optimizer is extremely powerful and does all sorts of inlining, runtime dataflow and escape analysis, and all manner of cache tricks.
The end result is that the sort of micro-benchmarks you are trying to perform are all but useless in practice, and extremely difficult to get right even when they are potentially useful.
Definitely read http://www.ibm.com/developerworks/java/library/j-benchmark1.html for a fuller discussion on the problems you face. At the very least you need to ensure:
foo is called in a loop that runs thousands of times
foo() returns a result, and
that result is used
The following is the minimum starting point, assuming foo() is non-trivial and therefore unlikely to be inlined. Note: you still have to expect loop unrolling and other cache-level optimizations. Also watch out for the HotSpot compile threshold (around 10,000 calls with -server), which can completely stuff up your measurements if you try to re-run them in the same JVM.
public class FooBar {
    public static void main(String[] args) {
        int sum = 0;
        final int ITERATIONS = 10000;
        for (int i = 0; i < ITERATIONS; i++) {
            sum += foo(i);
        }
        System.out.printf("%d iterations returned %d sum%n", ITERATIONS, sum);
    }

    // stand-in for the non-trivial method under test
    private static int foo(int i) {
        return i;
    }
}
Seriously, you need to do some reading before you can make any meaningful progress towards writing benchmarks on a modern JVM. The same optimizations that allows modern Java code to match or even sometimes beat C++ make benchmarking really difficult.
The Java compiler is not allowed to perform such optimizations because method calls are very likely to cause side effects, for example IO actions or changes to all fields they can reach, or calls to other methods that do so.
In functional languages, where each function call is guaranteed to return the same result if called with the same arguments (changes to state are forbidden), a compiler might indeed optimize away multiple calls by memoizing the result.
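By way of illustration, here is what that memoization looks like done by hand in Java for a pure function, using ConcurrentHashMap.computeIfAbsent; the class is a toy, not a library API.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.IntUnaryOperator;

// Sketch: manual memoization of a pure function. A Java compiler will not do
// this for you, but for side-effect-free methods you can cache results yourself.
public class Memo {
    private final ConcurrentHashMap<Integer, Integer> cache = new ConcurrentHashMap<>();
    private final IntUnaryOperator fn;
    int computations = 0;                 // exposed for the demo only

    Memo(IntUnaryOperator fn) { this.fn = fn; }

    int apply(int x) {
        return cache.computeIfAbsent(x, k -> {
            computations++;               // runs only on a cache miss
            return fn.applyAsInt(k);
        });
    }
}
```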
If you feel your algorithms are too fast, try to give them some large or complicated problem sets. There are only a few algorithms which are always quite fast.

Java Performance measurement

I am doing some Java performance comparison between my classes, and wondering if there is some sort of Java Performance Framework to make writing performance measurement code easier?
That is, what I am doing now is trying to measure the effect of making a method "synchronized", as in PseudoRandomUsingSynch.nextInt(), compared to using an AtomicInteger as my "synchronizer".
So I am trying to measure how long it takes to generate random integers using 3 threads accessing a synchronized method looping for say 10000 times.
I am sure there is a much better way doing this. Can you please enlighten me? :)
public static void main(String[] args) throws InterruptedException, ExecutionException {
    PseudoRandomUsingSynch rand1 = new PseudoRandomUsingSynch((int) System.currentTimeMillis());
    int n = 3;
    ExecutorService execService = Executors.newFixedThreadPool(n);
    long timeBefore = System.currentTimeMillis();
    for (int idx = 0; idx < 100000; ++idx) {
        Future<Integer> future = execService.submit(rand1);
        Future<Integer> future1 = execService.submit(rand1);
        Future<Integer> future2 = execService.submit(rand1);
        int random1 = future.get();
        int random2 = future1.get();
        int random3 = future2.get();
    }
    long timeAfter = System.currentTimeMillis();
    long elapsed = timeAfter - timeBefore;
    out.println("elapsed:" + elapsed);
}
the class
public class PseudoRandomUsingSynch implements Callable<Integer> {
    private int seed;

    public PseudoRandomUsingSynch(int s) { seed = s; }

    public synchronized int nextInt(int n) {
        byte[] s = DonsUtil.intToByteArray(seed);
        SecureRandom secureRandom = new SecureRandom(s);
        return (secureRandom.nextInt() % n);
    }

    @Override
    public Integer call() throws Exception {
        return nextInt((int) System.currentTimeMillis());
    }
}
Regards
Ignoring the question of whether a microbenchmark is useful in your case (Stephen C' s points are very valid), I would point out:
Firstly, don't listen to people who say 'it's not that hard'. Yes, microbenchmarking on a virtual machine with JIT compilation is difficult. It's actually really difficult to get meaningful and useful figures out of a microbenchmark, and anyone who claims it's not hard is either a supergenius or doing it wrong. :)
Secondly, yes, there are a few such frameworks around. One worth looking at (though it's in a very early pre-release stage) is Caliper, by Kevin Bourrillion and Jesse Wilson of Google. It looks really impressive from a few early looks at it.
More micro-benchmarking advice - micro benchmarks rarely tell you what you really need to know ... which is how fast a real application is going to run.
In your case, I imagine you are trying to figure out if your application will perform better using an Atomic object than using synchronized ... or vice versa. And the real answer is that it most likely depends on factors that a micro-benchmark cannot measure. Things like the probability of contention, how long locks are held, the number of threads and processors, and the amount of extra algorithmic work needed to make atomic update a viable solution.
EDIT - in response to this question.
so is there a way I can measure all these: probability of contention, lock-held duration, etc.?
In theory yes. Once you have implemented the entire application, it is possible to instrument it to measure these things. But that doesn't give you your answer either, because there isn't a predictive model you can plug these numbers into to give the answer. And besides, you've already implemented the application by then.
But my point was not that measuring these factors allows you to predict performance. (It doesn't!) Rather, it was that a micro-benchmark does not allow you to predict performance either.
In reality, the best approach is to implement the application according to your intuition, and then use profiling as the basis for figuring out where the real performance problems are.
OpenJDK guys have developed a benchmarking tool called JMH:
http://openjdk.java.net/projects/code-tools/jmh/
This provides quite an easy to setup framework, and there is a couple of samples showing how to use that.
http://hg.openjdk.java.net/code-tools/jmh/file/tip/jmh-samples/src/main/java/org/openjdk/jmh/samples/
Nothing can prevent you from writing a benchmark wrong, but they did a great job of eliminating the non-obvious mistakes (such as false sharing between threads, dead-code elimination, and so on).
These guys designed a good JVM measurement methodology so you won't fool yourself with bogus numbers, and then published it as a Python script so you can re-use their smarts -
Statistically Rigorous Java Performance Evaluation (pdf paper)
You probably want to move the loop into the task. As it is you just start all the threads and almost immediately you're back to single threaded.
Usual microbenchmarking advice: allow for some warm-up. As well as the average, the deviation is interesting. Use System.nanoTime instead of System.currentTimeMillis.
Specific to this problem is how much the threads contend. With a large number of contending threads, CAS loops can perform wasted work. Creating a SecureRandom is probably expensive, and so might System.currentTimeMillis be, to a lesser extent. I believe SecureRandom should already be thread-safe, if used correctly.
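Putting that advice together, here is one possible shape for the benchmark, with the loop inside each task, a warm-up pass, and System.nanoTime; the contended operation, thread count and iteration counts are illustrative.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: the loop lives inside each task, there is a warm-up pass,
// and timing uses System.nanoTime.
public class ContendedCounterBench {
    static long run(int threads, int iterationsPerThread) {
        AtomicInteger counter = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        Callable<Void> task = () -> {
            for (int i = 0; i < iterationsPerThread; i++) {
                counter.incrementAndGet();     // the contended operation under test
            }
            return null;
        };
        List<Callable<Void>> tasks = new ArrayList<>();
        for (int t = 0; t < threads; t++) tasks.add(task);
        try {
            pool.invokeAll(tasks);             // warm-up pass (results discarded)
            counter.set(0);
            long t0 = System.nanoTime();
            pool.invokeAll(tasks);             // measured pass
            long elapsed = System.nanoTime() - t0;
            if (counter.get() != threads * iterationsPerThread) throw new AssertionError();
            return elapsed;
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("elapsed ns: " + run(3, 100_000));
    }
}
```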
In short, you are thus searching for a "Java unit performance testing tool"?
Use JUnitPerf.
Update: for the case it's not clear yet: it also supports concurrent (multithreading) testing. Here's an extract of the chapter "LoadTest" of the aforementioned link which includes a code sample:
For example, to create a load test of 10 concurrent users, with each user running the ExampleTestCase.testOneSecondResponse() method for 20 iterations, and with a 1 second delay between the addition of users, use:
int users = 10;
int iterations = 20;
Timer timer = new ConstantTimer(1000);
Test testCase = new ExampleTestCase("testOneSecondResponse");
Test repeatedTest = new RepeatedTest(testCase, iterations);
Test loadTest = new LoadTest(repeatedTest, users, timer);

Optimizing the creation of objects inside loops

Which of the following would be more optimal on a Java 6 HotSpot VM?
final Map<Foo, Bar> map = new HashMap<Foo, Bar>(someNotSoLargeNumber);
for (int i = 0; i < someLargeNumber; i++)
{
    doSomethingWithMap(map);
    map.clear();
}
or
final int someNotSoLargeNumber = ...;
for (int i = 0; i < someLargeNumber; i++)
{
    final Map<Foo, Bar> map = new HashMap<Foo, Bar>(someNotSoLargeNumber);
    doSomethingWithMap(map);
}
I think they both make the intent equally clear, so I don't think style/added complexity is an issue here.
Intuitively it looks like the first one would be better as there's only one 'new'. However, given that no reference to the map is held onto, would HotSpot be able to determine that a map of the same size (Entry[someNotSoLargeNumber] internally) is being created for each loop and then use the same block of memory (i.e. not do a lot of memory allocation, just zeroing that might be quicker than calling clear() for each loop)?
An acceptable answer would be a link to a document describing the different types of optimisations the HotSpot VM can actually do, and how to write code to assist HotSpot (rather than naive attempts at optimising the code by hand).
Don't spend your time on such micro-optimizations unless your profiler says you should. In particular, Sun claims that modern garbage collectors do very well with short-lived objects, and new() becomes cheaper and cheaper:
Garbage collection and performance on DeveloperWorks
That's a pretty tight loop over a "fairly large number", so generally I would say move the instantiation outside of the loop. But overall, my guess is you aren't going to notice much of a difference, as I am willing to bet that your doSomethingWithMap will take up the majority of the time, allowing the GC to catch up.
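If you do want to measure it, a sketch along these lines (sizes and workload invented) compares the two variants directly rather than guessing; results will vary by JVM and GC.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: measure the two variants from the question instead of guessing.
public class MapReuseBench {
    static long fill(Map<Integer, Integer> map, int size) {
        long sum = 0;
        for (int i = 0; i < size; i++) map.put(i, i);
        for (int v : map.values()) sum += v;   // use the map so it isn't optimized away
        return sum;
    }

    static long reuseVariant(int loops, int size) {
        long sum = 0;
        Map<Integer, Integer> map = new HashMap<>(size);
        for (int i = 0; i < loops; i++) {
            sum += fill(map, size);
            map.clear();                        // reuse one map
        }
        return sum;
    }

    static long allocateVariant(int loops, int size) {
        long sum = 0;
        for (int i = 0; i < loops; i++) {
            sum += fill(new HashMap<>(size), size);  // fresh map each iteration
        }
        return sum;
    }

    public static void main(String[] args) {
        long t0 = System.nanoTime();
        long s1 = reuseVariant(10_000, 64);
        long t1 = System.nanoTime();
        long s2 = allocateVariant(10_000, 64);
        long t2 = System.nanoTime();
        System.out.println("reuse ns=" + (t1 - t0) + " alloc ns=" + (t2 - t1)
                + " sums equal: " + (s1 == s2));
    }
}
```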
