Is extracting to static final necessary for Java optimization?

Consider this method:
private void iterate(List<Worker> workers) {
SortedSet<Worker> set = new TreeSet<>(new Comparator<Worker>() {
@Override
public int compare(Worker w0, Worker w1) {
return Double.compare(w0.average, w1.average);
}
});
// ...
}
As you can see, the method creates a new TreeSet with a custom comparator.
I was wondering whether it makes any difference from a performance/memory/garbage collection/whatever point of view if I were to do this instead, at the cost of polluting the outer scope:
static final Comparator<Worker> COMPARATOR = new Comparator<Worker>() {
@Override
public int compare(Worker w0, Worker w1) {
return Double.compare(w0.average, w1.average);
}
};
private void iterate(List<Worker> workers) {
SortedSet<Worker> set = new TreeSet<>(COMPARATOR);
// ...
}
The reason I am asking is that I feel the compiler should already figure this out and optimize it for me, so I shouldn't have to extract it, right?
The same thing goes for Strings or other temporary, immutable objects declared within a method.
Would it make any difference to extract them to a final variable?
Note: I am aware that any performance boost from this would be small. The question is whether there is any difference, however negligible.

Yes, there will be a difference.
CPU effect: assigning the comparator to a static field removes the work of allocating a new comparator on every call.
GC effect: allocating a new object every time and then immediately discarding it has no young GC cost; however, assigning it to a static variable will increase GC times (very, very marginally), as it is an extra reference that needs to be walked. Dead objects cost nothing, live objects do.
Memory effect: assigning the comparator to a constant will reduce how much memory each invocation of the method will require, in exchange for a low constant overhead that will be moved into tenured GC space.
Risk of reference escapes: Inner classes hold a reference to the instance that constructed them. If the inner class (Comparator) were ever returned out of the method that created it, a strong reference to the parent object could escape and prevent GC of the parent. This is purely a gotcha that can creep into code; it is not a problem in this example.
Hotspot is very good at inlining, but it is unlikely to recognise that the comparator can be allocated on the stack or moved to a constant. That will depend on the contents of TreeSet: if the implementation of TreeSet were very simple (and small) it could get inlined, but we all know that it is not. Also, TreeSet is coded to be generic. If it were only ever used with one type of object (Worker), there are some optimisations that the JVM could apply; however, we should assume that TreeSet will be used with other types too, so the compiled TreeSet code cannot make any assumptions about the Comparator that is being passed into it.
Thus the difference between the two versions is primarily an object allocation. The use of the final keyword is unlikely to improve performance as Hotspot mostly ignores the final keyword anyway.
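The allocation difference can be made visible with a small sketch. The names ComparatorAllocation, freshComparator and sharedComparator are made up for illustration, and Worker is reduced to its average field; the only point is that evaluating new on an anonymous class yields a distinct object on every call, while the constant is created once.

```java
import java.util.Comparator;

public class ComparatorAllocation {
    // Hypothetical stand-in for the Worker class from the question.
    public static class Worker {
        final double average;
        public Worker(double average) { this.average = average; }
    }

    // Second variant from the question: one comparator for the lifetime of the class.
    static final Comparator<Worker> COMPARATOR = new Comparator<Worker>() {
        @Override
        public int compare(Worker w0, Worker w1) {
            return Double.compare(w0.average, w1.average);
        }
    };

    // First variant from the question: a new comparator object on every call.
    public static Comparator<Worker> freshComparator() {
        return new Comparator<Worker>() {
            @Override
            public int compare(Worker w0, Worker w1) {
                return Double.compare(w0.average, w1.average);
            }
        };
    }

    public static Comparator<Worker> sharedComparator() {
        return COMPARATOR;
    }

    public static void main(String[] args) {
        System.out.println(freshComparator() == freshComparator());   // false: two distinct objects
        System.out.println(sharedComparator() == sharedComparator()); // true: the same constant
    }
}
```

Whether the JIT later removes the per-call allocation is a separate matter; this only shows what the source code asks for.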
Java 8 has a very interesting behaviour here when using lambdas. Consider the following variant of your example:
import java.util.*;
public class T {
public void iterate(List<String> workers) {
SortedSet<Double> set = new TreeSet<>( Double::compare );
}
}
Run 'javap -c T.class', and you will see the following JVM bytecode:
public void iterate(java.util.List<java.lang.String>);
Code:
0: new #2 // class java/util/TreeSet
3: dup
4: invokedynamic #3, 0 // InvokeDynamic #0:compare:()Ljava/util/Comparator;
9: invokespecial #4 // Method java/util/TreeSet."<init>":(Ljava/util/Comparator;)V
12: astore_2
13: return
The cool thing to note here is that there is no object construction for the lambda. invokedynamic will have a higher cost the first time that it is called and then it gets effectively cached.
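A minimal sketch for checking the caching behaviour on your own JVM follows. The class and method names are illustrative. Whether two evaluations of the same non-capturing method reference return the same object is an implementation detail that the language specification does not promise, so the first printed line is deliberately left without an expected value.

```java
import java.util.Comparator;
import java.util.SortedSet;
import java.util.TreeSet;

public class LambdaComparator {
    // A non-capturing method reference: it needs no enclosing instance and no local state.
    public static Comparator<Double> comparator() {
        return Double::compare;
    }

    // Builds a sorted set using the method-reference comparator.
    public static SortedSet<Double> sorted(double... values) {
        SortedSet<Double> set = new TreeSet<>(comparator());
        for (double v : values) {
            set.add(v);
        }
        return set;
    }

    public static void main(String[] args) {
        // Implementation-specific result; do not rely on it in real code.
        System.out.println("same instance: " + (comparator() == comparator()));
        System.out.println(sorted(3.0, 1.0, 2.0)); // [1.0, 2.0, 3.0]
    }
}
```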

One big difference is caused by the "hidden" implication that anonymous classes hold an implicit reference to the containing class instance. If you pass that TreeSet to another piece of code, that code holds a reference to your class instance via the TreeSet and its anonymous Comparator, so your instance can't be garbage collected.
That can cause a memory leak.
Option 2 however doesn't suffer from that problem.
Otherwise, it's a matter of style.
In Java 8, you can use a lambda expression instead, which is the best of both worlds.
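The hidden reference can be inspected with reflection. This is a sketch with made-up names (CaptureDemo, bias, holdsOuter); the anonymous class deliberately reads a field of the enclosing instance so that the capture does not depend on how a particular compiler version treats an unused enclosing reference.

```java
import java.lang.reflect.Field;
import java.util.Comparator;

public class CaptureDemo {
    private final double bias;

    public CaptureDemo(double bias) {
        this.bias = bias;
    }

    // Created in an instance method and using an outer field:
    // the object must carry a reference to the enclosing CaptureDemo.
    public Comparator<Double> anonymous() {
        return new Comparator<Double>() {
            @Override
            public int compare(Double a, Double b) {
                return Double.compare(a + bias, b + bias);
            }
        };
    }

    // Created in a static context: there is no enclosing instance to capture.
    public static final Comparator<Double> CONSTANT = new Comparator<Double>() {
        @Override
        public int compare(Double a, Double b) {
            return Double.compare(a, b);
        }
    };

    // True if the object's class declares a field whose type is CaptureDemo.
    public static boolean holdsOuter(Object o) {
        for (Field f : o.getClass().getDeclaredFields()) {
            if (f.getType() == CaptureDemo.class) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(holdsOuter(new CaptureDemo(1.0).anonymous())); // true
        System.out.println(holdsOuter(CONSTANT));                         // false
    }
}
```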

Related

When can Hotspot allocate objects on the stack? [duplicate]

This question already has answers here:
Eligibility for escape analysis / stack allocation with Java 7
Since somewhere around Java 6, the Hotspot JVM can do escape analysis and allocate non-escaping objects on the stack instead of on the garbage collected heap. This results in a speedup of the generated code and reduces pressure on the garbage collector.
What are the rules for when Hotspot is able to stack allocate objects? In other words when can I rely on it to do stack allocation?
edit: This question is a duplicate, but (IMO) the answer below is a better answer than what is available at the original question.

I have done some experimentation in order to see when Hotspot is able to stack allocate. It turns out that its stack allocation is quite a bit more limited than what you might expect based on the available documentation. The referenced paper by Choi "Escape Analysis for Java" suggests that an object that is only ever assigned to local variables can always be stack allocated. But that is not true.
All of these are implementation details of the current Hotspot implementation, so they could change in future versions. This refers to my OpenJDK install, which is version 1.8.0_121 for X86-64.
The short summary, based on quite a bit of experimentation, seems to be:
Hotspot can stack-allocate an object instance if
all its uses are inlined
it is never assigned to any static or object fields, only to local variables
at each point in the program, which local variables contain references to the object must be JIT-time determinable, and not depend on any unpredictable conditional control flow.
If the object is an array, its size must be known at JIT time and indexing into it must use JIT-time constants.
To know when these conditions hold you need to know quite a bit about how Hotspot works. Relying on Hotspot to definitely do stack allocation in a certain situation can be risky, as a lot of non-local factors are involved. In particular, whether everything gets inlined can be difficult to predict.
Practically speaking, simple iterators will usually be stack allocatable if you just use them to iterate. For composite objects only the outer object can ever be stack allocated, so lists and other collections always cause heap allocation.
If you have a HashMap<Integer,Something> and you use it in myHashMap.get(42), the 42 may stack allocate in a test program, but it will not in a full application because you can be sure that there will be more than two types of key objects in HashMaps in the entire program, and therefore the hashCode and equals methods on the key won't inline.
Beyond that I don't see any generally applicable rules, and it will depend on the specifics of the code.
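As an illustration of the conditions listed above, here is a sketch with two loops that differ only in whether the object is stored in a static field. The names (LocalAllocation, Vec2, last) are invented for the example, and nothing in the code can force or prove what the JIT decides; it only shows which shape is a candidate and which is not.

```java
public class LocalAllocation {
    public static final class Vec2 {
        final long x;
        final long y;
        Vec2(long x, long y) { this.x = x; this.y = y; }
    }

    // Anything stored in a static field escapes, per the rules above.
    static Vec2 last;

    // Candidate: v is only ever held in a local variable and its uses are trivial.
    public static long sumLocal(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            Vec2 v = new Vec2(i, 2L * i);
            total += v.x + v.y;
        }
        return total;
    }

    // Not a candidate: the same object is also assigned to a static field.
    public static long sumEscaping(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            Vec2 v = new Vec2(i, 2L * i);
            last = v;
            total += v.x + v.y;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumLocal(1000));    // 1498500
        System.out.println(sumEscaping(1000)); // 1498500
    }
}
```

Both methods compute the same value; any difference would show up only in allocation behaviour once the methods are hot.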
Hotspot internals
The first important thing to know is that escape analysis is performed after inlining. This means that Hotspot's escape analysis is in this respect more powerful than the description in the Choi paper, since an object returned from a method but local to the caller method can still be stack allocated. Because of this iterators can nearly always be stack allocated if you do e.g. for(Foo item : myList) {...} (and the implementation of myList.iterator() is simple enough, which they usually are.)
Hotspot only compiles optimized versions of methods once it determines the method is 'hot', so code that is not run a lot of times does not get optimized at all, in which case there is no stack allocation or inlining whatsoever. But for those methods you usually don't care.
Inlining
Inlining decisions are based on profiling data that Hotspot collects first. The declared types do not matter so much, even if a method is virtual Hotspot can inline it based on the types of the objects it sees during profiling. Something similar holds for branches (i.e. if-statements and other control flow constructs): If during profiling Hotspot never sees a certain branch being taken, it will compile and optimize the code based on the assumption that the branch is never taken. In both cases, if Hotspot cannot prove that its assumptions will always be true, it will insert checks in the compiled code known as 'uncommon traps', and if such a trap is hit Hotspot will de-optimize and possibly re-optimize taking the new information into account.
Hotspot will profile which object types occur as receivers at which call sites. If Hotspot only sees a single type or only two distinct types occurring at a call site, it is able to inline the called method. If there are only one or two very common types and other types occur much less often, Hotspot should still be able to inline the methods of the common types, including a check for which code it needs to take. (I'm not entirely sure about this last case with one or two common types and more uncommon types, though.) If there are more than two common types, Hotspot will not inline the call at all but instead generate machine code for an indirect call.
'Type' here refers to the exact type of an object. Implemented interfaces or shared superclasses are not taken into account. Even if different receiver types occur at a call site but they all inherit the same implementation of a method (e.g. multiple classes that all inherit hashCode from Object), Hotspot will still generate an indirect call and not inline. (So i.m.o. hotspot is quite stupid in such cases. I hope future versions improve this.)
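To make the notion of a call site concrete, here is a sketch in which one source-level call, s.area(), sees one, two or three exact receiver types depending on the array passed in. The Shape hierarchy is invented for the example; how the JIT actually treats each case is as described above and is not observable from the program's output.

```java
public class CallSiteProfile {
    public interface Shape { double area(); }

    public static final class Square implements Shape {
        final double side;
        public Square(double side) { this.side = side; }
        public double area() { return side * side; }
    }

    public static final class Rect implements Shape {
        final double w, h;
        public Rect(double w, double h) { this.w = w; this.h = h; }
        public double area() { return w * h; }
    }

    public static final class Triangle implements Shape {
        final double base, height;
        public Triangle(double base, double height) { this.base = base; this.height = height; }
        public double area() { return 0.5 * base * height; }
    }

    // s.area() is a single call site; its profile records every exact type that reaches it.
    public static double totalArea(Shape[] shapes) {
        double sum = 0;
        for (Shape s : shapes) {
            sum += s.area();
        }
        return sum;
    }

    public static void main(String[] args) {
        Shape[] oneType = { new Square(2), new Square(2) };
        Shape[] threeTypes = { new Square(2), new Rect(2, 3), new Triangle(4, 3) };
        System.out.println(totalArea(oneType));    // 8.0
        System.out.println(totalArea(threeTypes)); // 16.0
    }
}
```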
Hotspot will also only inline methods that are not too big. 'Not too big' is determined by the -XX:MaxInlineSize=n and -XX:FreqInlineSize=n options. Inlinable methods with a JVM bytecode size below MaxInlineSize are always inlined, methods with a JVM bytecode size below FreqInlineSize are inlined if the call is 'hot'. Larger methods are never inlined. By default MaxInlineSize is 35 and FreqInlineSize is platform dependent but for me it is 325. So make sure your methods are not too big if you want them inlined. It can sometimes help to split out the common path from a large method, so that it can be inlined into its callers.
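Splitting out the common path could look like the following sketch. FastPathSplit and its methods are hypothetical; the idea is only that the frequently called method stays a handful of bytecodes long while the rarely taken work lives in a separate method.

```java
public class FastPathSplit {
    private final long[] cache = new long[64];
    private final boolean[] cached = new boolean[64];

    // Hot path: deliberately tiny so that it has a good chance of being inlined into callers.
    public long square(int n) {
        if (n >= 0 && n < cache.length && cached[n]) {
            return cache[n];
        }
        return squareSlow(n);
    }

    // Cold path: computes the value and fills the cache when the index fits.
    private long squareSlow(int n) {
        long result = (long) n * n;
        if (n >= 0 && n < cache.length) {
            cache[n] = result;
            cached[n] = true;
        }
        return result;
    }

    public static void main(String[] args) {
        FastPathSplit f = new FastPathSplit();
        System.out.println(f.square(7)); // 49, via the slow path
        System.out.println(f.square(7)); // 49, via the cached fast path
    }
}
```

To see what Hotspot actually decides, the diagnostic flags -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining print the inlining decisions for compiled methods.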
Profiling
One important thing to know about profiling is that profiling sites are based on the JVM bytecode, which itself is not inlined in any way. So if you have e.g. a static method
static <T,U> List<U> map(List<T> list, Function<T,U> func) {
List<U> result = new ArrayList<>();
for(T item : list) { result.add(func.call(item)); }
return result;
}
that maps a SAM Function callable over a list and returns the transformed list, Hotspot will treat the call to func.call as a single program-wide call site. You might call this map function at several spots in your program, passing a different func in at each call site (but the same one for one call site). In that case you might expect that Hotspot is able to inline map, and then also the call to func.call since at every use of map there is only a single func type. If this were so, Hotspot would be able to optimize the loop down very tightly. Unfortunately Hotspot is not smart enough for that. It only keeps a single profile for the func.call call site, lumping all the func types that you pass to map together. You will probably use more than two different implementations of func, so Hotspot will not be able to inline the call to func.call. Link for more details, and archived link as the original appears to be gone.
(As an aside, in Kotlin the equivalent loop can be fully inlined as the Kotlin compiler can do inlining of calls at the bytecode level. So for some uses it could be significantly faster than Java.)
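The shared call site described above can be reproduced with a runnable version of map. This sketch assumes java.util.function.Function, whose method is apply, as a stand-in for the SAM type called Function in the snippet; the class name SharedCallSite is made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

public class SharedCallSite {
    public static <T, U> List<U> map(List<T> list, Function<T, U> func) {
        List<U> result = new ArrayList<>();
        for (T item : list) {
            // The only func.apply call site in the program: every caller's function is profiled here.
            result.add(func.apply(item));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(1, 2, 3);
        // Three callers with three different Function classes, all funnelled through one call site.
        List<Integer> plusOne = map(input, x -> x + 1);
        List<Integer> doubled = map(input, x -> x * 2);
        List<String> tagged = map(input, x -> "#" + x);
        System.out.println(plusOne); // [2, 3, 4]
        System.out.println(doubled); // [2, 4, 6]
        System.out.println(tagged);  // [#1, #2, #3]
    }
}
```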
Scalar Replacement
Another important thing to know is that Hotspot does not actually implement stack allocation of objects. Instead it implements scalar replacement, which means that an object is deconstructed into its constituent fields and those fields are stack allocated like normal local variables. This means that there is no object left at all. Scalar replacement only works if there is never a need to create a pointer to the stack-allocated object. Some forms of stack allocation in e.g. C++ or Go would be able to allocate full objects on the stack and then pass references or pointers to them to called functions, but in Hotspot this does not work. Therefore if there is ever a need to pass an object reference to a non-inlined method, even if the reference would not escape the called method, Hotspot will always heap-allocate such an object.
In principle Hotspot could be smarter about this, but right now it is not.
Test program
I used the following program and variations to see when Hotspot will do scalar replacement.
// Minimal example for which the JVM does not scalarize the allocation. If field is final, or the second allocation is unconditional, it will.
class Scalarization {
int field = 0xbd;
long foo(long i) { return i * field; }
public static void main(String[] args) {
long result = 0;
for(long i=0; i<100; i++) {
result += test();
}
System.out.println("Result: "+result);
}
static long test() {
long ctr = 0x5;
for(long i=0; i<0x10000; i++) {
Scalarization s = new Scalarization();
ctr = s.foo(ctr);
if(i == 0) s = new Scalarization();
ctr = s.foo(ctr);
}
return ctr;
}
}
If you compile and run this program with javac Scalarization.java; java -verbose:gc Scalarization you can see whether scalar replacement worked by the number of garbage collections. If scalar replacement works, no garbage collection happens on my system; if it does not, I see a few garbage collections.
Variants that Hotspot is able to scalarize run significantly faster than versions where it is not. I verified the generated machine code (instructions) to make sure Hotspot was not doing any unexpected optimizations. If Hotspot is able to scalar replace the allocations, it can then also do some additional optimizations on the loop, unrolling it a few iterations and then combining those iterations together. So in the scalarized versions the effective loop count is lower, with each iteration doing the work of multiple source-code-level iterations. The speed difference is therefore not only due to allocation and garbage collection overhead.
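Counting garbage collections is an indirect signal. A more direct one, sketched below, is the per-thread allocation counter exposed by com.sun.management.ThreadMXBean, which is a HotSpot-specific extension and may be missing on other JVMs. This is not the method used for the measurements in this answer; the class AllocationProbe and its Box type are invented for the example.

```java
import java.lang.management.ManagementFactory;

public class AllocationProbe {
    static final class Box {
        final long value;
        Box(long value) { this.value = value; }
    }

    // Written to from the measured task so the work is not trivially dead code.
    static volatile long sink;

    // Allocates one Box per iteration; the object never leaves the local variable.
    public static long work(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            Box b = new Box(i);
            total += b.value;
        }
        return total;
    }

    // Bytes allocated by the current thread while task runs, or -1 if the JVM does not report it.
    public static long allocatedBytesDuring(Runnable task) {
        java.lang.management.ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        if (!(bean instanceof com.sun.management.ThreadMXBean)) {
            return -1;
        }
        com.sun.management.ThreadMXBean hotspotBean = (com.sun.management.ThreadMXBean) bean;
        if (!hotspotBean.isThreadAllocatedMemorySupported()) {
            return -1;
        }
        long id = Thread.currentThread().getId();
        long before = hotspotBean.getThreadAllocatedBytes(id);
        task.run();
        long after = hotspotBean.getThreadAllocatedBytes(id);
        return (before < 0 || after < 0) ? -1 : after - before;
    }

    public static void main(String[] args) {
        // Early rounds run interpreted; later rounds show what the optimized code allocates.
        for (int round = 0; round < 10; round++) {
            long bytes = allocatedBytesDuring(() -> sink = work(5_000_000));
            System.out.println("round " + round + ": " + bytes + " bytes allocated");
        }
    }
}
```

If the allocation is scalar replaced, the later rounds should report far fewer bytes than the first ones; the exact numbers depend on the JVM and are not predictable here.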
Observations
I tried a number of variations on the above program. One condition for scalar replacement is that the object must never be assigned to an object (or static) field, and presumably also not into an array. So in code like
Foo f = new Foo();
bar.field = f;
the Foo object cannot be scalar replaced. This holds even if bar itself is scalar replaced, and also if you never again use bar.field. So an object can only ever be assigned to local variables.
That alone is not enough, Hotspot must also be able to determine statically at JIT-time which object instance will be the target of a call. For example, using the following implementations of foo and test and removing field causes heap allocation:
long foo(long i) { return i * 0xbb; }
static long test() {
long ctr = 0x5;
for(long i=0; i<0x10000; i++) {
Scalarization s = new Scalarization();
ctr = s.foo(ctr);
if(i == 50) s = new Scalarization();
ctr = s.foo(ctr);
}
return ctr;
}
While if you then remove the conditional for the second assignment no more heap allocation occurs:
static long test() {
long ctr = 0x5;
for(long i=0; i<0x10000; i++) {
Scalarization s = new Scalarization();
ctr = s.foo(ctr);
s = new Scalarization();
ctr = s.foo(ctr);
}
return ctr;
}
In this case Hotspot can determine statically which instance is the target for each call to s.foo.
On the other hand, even if the second assignment to s is a subclass of Scalarization with a completely different implementation, as long as the assignment is unconditional Hotspot will still scalarize the allocations.
Hotspot does not appear to be able to move an object to the heap that was previously scalar replaced (at least not without deoptimizing). Scalar replacement is an all-or-nothing affair. So in the original test method both allocations of Scalarization always happen on the heap.
Conditionals
One important detail is that Hotspot will predict conditionals based on its profiling data. If a conditional assignment is never executed, Hotspot will compile code under that assumption, and then might be able to do scalar replacement. If at a later point in time the condition does get taken, Hotspot will need to recompile the code with this new assumption. The new code will not do scalar replacement, since Hotspot can no longer determine the receiver instance of following calls statically.
For instance in this variant of test:
static long limit = 0;
static long test() {
long ctr = 0x5;
long i = limit;
limit += 0x10000;
for(; i<limit; i++) { // In this form if scalarization happens is nondeterministic: if the condition is hit before profiling starts scalarization happens, else not.
Scalarization s = new Scalarization();
ctr = s.foo(ctr);
if(i == 0xf9a0) s = new Scalarization();
ctr = s.foo(ctr);
}
return ctr;
}
the conditional assignment is only executed once during the lifetime of the program. If this assignment occurs early enough, before Hotspot starts full profiling of the test method, Hotspot never notices the conditional being taken and compiles code that does scalar replacement. If profiling has already started when the conditional is taken, Hotspot will not do scalar replacement. With the test value of 0xf9a0, whether scalar replacement happens is nondeterministic on my computer, since exactly when profiling starts can vary (e.g. because profiling and optimized code is compiled on background threads). So if I run the above variant it sometimes does a few garbage collections, and sometimes does not.
Hotspot's static code analysis is much more limited than what C/C++ and other static compilers can do, so Hotspot is not as smart in following the control flow in a method through several conditionals and other control structures to determine the instance that a variable refers to, even if it would be statically determinable for the programmer or a smarter compiler. In many cases the profiling information will make up for that, but it is something to be aware of.
Arrays
Arrays can be stack allocated if their size is known at JIT time. However indexing into an array is not supported unless Hotspot can also statically determine the index value at JIT-time. So stack allocated arrays are pretty useless. Since most programs don't use arrays directly but use the standard collections this is not very relevant, as embedded objects such as the array containing the data within an ArrayList already need to be heap-allocated due to their embedded-ness. I suppose the reasoning for this restriction is that there exists no indexing operation on local variables so this would require additional code generation functionality for a pretty rare use case.
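The two array cases described above look like this in code. The sketch uses invented names and cannot show what the JIT does; it only contrasts an array whose size and indices are all compile-time constants with one that is indexed by a run-time value.

```java
public class ArrayIndexing {
    // Fixed size and constant indices: the shape described above as eligible.
    public static int constantIndices(int a, int b) {
        int[] pair = new int[2];
        pair[0] = a;
        pair[1] = b;
        return pair[0] + pair[1];
    }

    // Fixed size, but the read uses an index that is only known at run time.
    public static int variableIndex(int a, int b, int i) {
        int[] pair = new int[2];
        pair[0] = a;
        pair[1] = b;
        return pair[i & 1];
    }

    public static void main(String[] args) {
        System.out.println(constantIndices(3, 4));  // 7
        System.out.println(variableIndex(3, 4, 1)); // 4
    }
}
```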

May the removal of an unused field cause a garbage collection?

For a library that involves asynchronous operations, I have to keep a reference to an object alive until a certain condition is met.
(I know, that sounds unusual. So here is some context, although it may not strictly be relevant: The object may be considered to be a direct ByteBuffer which is used in JNI operations. The JNI operations will fetch the address of the buffer. At this point, this address is only a "pointer" that is not considered as a reference to the byte buffer. The address may be used asynchronously, later in time. Thus, the buffer has to be prevented from being garbage collected until the JNI operation is finished.)
To achieve this, I implemented a method that is basically equivalent to this:
private static void keepReference(final Object object)
{
Runnable runnable = new Runnable()
{
@SuppressWarnings("unused")
private Object localObject = object;
public void run()
{
// Do something that does NOT involve the "localObject" ...
waitUntilCertainCondition();
// When this is done, the localObject may be garbage collected
}
};
someExecutor.execute(runnable);
}
The idea is to create a Runnable instance that has the required object as a field, throw this runnable into an executor, and let the runnable wait until the condition is met. The executor will keep a reference to the runnable instance until it is finished. The runnable is supposed to keep a reference to the required object. So only after the condition is met will the runnable be released by the executor, and thus the local object will become eligible for garbage collection.
The localObject field is not used in the body of the run() method. May the compiler (or more precisely: the runtime) detect this, and decide to remove this unused reference, and thus allow the object to be garbage collected too early?
(I considered workarounds for this. For example, using the object in a "dummy statement" like logger.log(FINEST, localObject);. But even then, one could not be sure that a "smart" optimizer wouldn't do some inlining and still detect that the object is not really used)
Update: As pointed out in the comments: Whether this can work at all might depend on the exact Executor implementation (although I'd have to analyze this more carefully). In the given case, the executor will be a ThreadPoolExecutor.
This may be one step towards the answer:
The ThreadPoolExecutor has an afterExecute method. One could override this method and then use a sledgehammer of reflection to dive into the Runnable instance that is given there as an argument. Now, one could simply use reflection hacks to walk to this reference, and use runnable.getClass().getDeclaredFields() to fetch the fields (namely, the localObject field), and then fetch the value of this field. And I think that it should not be allowed to observe a value there that is different from the one that it originally had.
Another comment pointed out that the default implementation of afterExecute is empty, but I'm not sure whether this fact can affect the question of whether the field may be removed or not.
Right now, I strongly assume that the field may not be removed. But some definite reference (or at least more convincing arguments) would be nice.
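The reflective access described above can be sketched as follows. FieldPeek, holder and peek are illustrative names, and the lookup is done directly rather than inside afterExecute; it only demonstrates that the field of the anonymous Runnable is observable from outside the class.

```java
import java.lang.reflect.Field;

public class FieldPeek {
    // Builds a Runnable with an otherwise unused field, mirroring keepReference above.
    public static Runnable holder(final Object object) {
        return new Runnable() {
            @SuppressWarnings("unused")
            private Object localObject = object;

            @Override
            public void run() {
                // Intentionally does not touch localObject.
            }
        };
    }

    // Reads the field named localObject from the given Runnable via reflection.
    public static Object peek(Runnable r) {
        try {
            Field field = r.getClass().getDeclaredField("localObject");
            field.setAccessible(true);
            return field.get(r);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("field not readable", e);
        }
    }

    public static void main(String[] args) {
        Object payload = new Object();
        System.out.println(peek(holder(payload)) == payload); // true
    }
}
```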
Update 2: Based on the comments and the answer by Holger, I think that not the removal of "the field itself" may be a problem, but rather the GC of the surrounding Runnable instance. So right now, I assume that one could try something like this:
private static long dummyCounter = 0;
private static Executor executor = new ThreadPoolExecutor(...) {
@Override
public void afterExecute(Runnable r, Throwable t) {
if (r != null) dummyCounter++;
if (dummyCounter == Long.MAX_VALUE) {
System.out.println("This will never happen " + r);
}
}
};
to make sure that the localObject in the runnable really lives as long as it should. But I can hardly remember ever having been forced to write something that screamed "crude hack" as loud as these few lines of code...

If JNI code fetches the address of a direct buffer, it should be the responsibility of the JNI code itself, to hold a reference to the direct buffer object as long as the JNI code holds the pointer, e.g. using NewGlobalRef and DeleteGlobalRef.
Regarding your specific question, this is addressed directly in JLS §12.6.1. Implementing Finalization:
Optimizing transformations of a program can be designed that reduce the number of objects that are reachable to be less than those which would naively be considered reachable. …
Another example of this occurs if the values in an object's fields are stored in registers. … Note that this sort of optimization is only allowed if references are on the stack, not stored in the heap.
(the last sentence matters)
It is illustrated in that chapter by an example not too different to yours. To make things short, the localObject reference within the Runnable instance will keep the life time of the referenced object at least as long as the life time of the Runnable instance.
That said, the critical point here is the actual life time of the Runnable instance. It will be considered definitely alive, i.e. immune to optimizations, due to the rule specified above, if it is also referred by an object that is immune to optimizations, but even an Executor isn’t necessarily a globally visible object.
That said, method inlining is one of the simplest optimizations, after which a JVM would detect that the afterExecute of a ThreadPoolExecutor is a no-op. By the way, the Runnable passed to it is the Runnable passed to execute, but it wouldn’t be the same as passed to submit, if you use that method, as (only) in the latter case, it’s wrapped in a RunnableFuture.
Note that even the ongoing execution of the run() method does not prevent the collection of the Runnable implementation’s instance, as illustrated in “finalize() called on strongly reachable object in Java 8”.
The bottom line is that you will be walking on thin ice when you try to fight the garbage collector. As the first sentence of the cite above states: “Optimizing transformations of a program can be designed that reduce the number of objects that are reachable to be less than those which would naively be considered reachable.” And we all may find ourselves thinking too naively…
As said at the beginning, you may rethink the responsibilities. It’s worth noting that when your class has a close() method which has to be invoked to release the resource after all threads have finished their work, this required explicit action is already sufficient to prevent the early collection of the resource (assuming that the method is indeed called at the right point)…
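A resource class with an explicit close() could be sketched like this. PinnedBuffer is a hypothetical name and the JNI part is omitted entirely; the sketch only shows the shape the paragraph above argues for, in which the caller holds the owner object until it explicitly releases it.

```java
import java.nio.ByteBuffer;

public class PinnedBuffer implements AutoCloseable {
    private final ByteBuffer buffer;
    private boolean closed;

    public PinnedBuffer(int capacity) {
        this.buffer = ByteBuffer.allocateDirect(capacity);
    }

    // Hands out the buffer only while the owner has not been closed.
    public ByteBuffer buffer() {
        if (closed) {
            throw new IllegalStateException("already closed");
        }
        return buffer;
    }

    public boolean isClosed() {
        return closed;
    }

    // Must be called once all (native) work on the buffer has finished.
    @Override
    public void close() {
        closed = true;
    }

    // Usage: try-with-resources guarantees that close() runs after the block.
    public static int demo() {
        try (PinnedBuffer pinned = new PinnedBuffer(16)) {
            pinned.buffer().putInt(0, 42);
            return pinned.buffer().getInt(0);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // 42
    }
}
```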

Execution of Runnable in a thread pool is not enough to keep an object from being garbage collected. Even "this" can be collected! See JDK-8055183.
The following example shows that keepReference does not really keep it. Though the problem does not happen with vanilla JDK (because the compiler is not smart enough), it can be reproduced when a call to ThreadPoolExecutor.afterExecute is commented out. This is an entirely legal optimization, because afterExecute is a no-op in the default ThreadPoolExecutor implementation.
import java.lang.ref.WeakReference;
import java.util.concurrent.*;
public class StrangeGC {
private static final ExecutorService someExecutor =
Executors.newSingleThreadExecutor();
private static void keepReference(final Object object) {
Runnable runnable = new Runnable() {
@SuppressWarnings("unused")
private Object localObject = object;
public void run() {
WeakReference<?> ref = new WeakReference<>(object);
if (ThreadLocalRandom.current().nextInt(1024) == 0) {
System.gc();
}
if (ref.get() == null) {
System.out.println("Object is garbage collected");
System.exit(0);
}
}
};
someExecutor.execute(runnable);
}
public static void main(String[] args) throws Exception {
while (true) {
keepReference(new Object());
}
}
}
Your hack with overriding afterExecute will work though.
You've basically invented a kind of Reachability Fence, see JDK-8133348.
The problem you've faced is known. It will be addressed in Java 9 as a part of JEP 193. There will be a standard API to explicitly mark objects as reachable: Reference.reachabilityFence(obj).
Update
Javadoc comments to Reference.reachabilityFence suggest synchronized block as an alternative construction to ensure reachability.
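With Java 9 or later, the fence can be used roughly as in the following sketch. FenceDemo and both method names are made up, and the CountDownLatch merely stands in for waitUntilCertainCondition() from the question.

```java
import java.lang.ref.Reference;
import java.lang.ref.WeakReference;
import java.util.concurrent.CountDownLatch;

public class FenceDemo {
    // A Runnable that keeps object strongly reachable until the latch has been released.
    public static Runnable keepReference(final Object object, final CountDownLatch done) {
        return () -> {
            try {
                done.await(); // stand-in for waitUntilCertainCondition()
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                Reference.reachabilityFence(object);
            }
        };
    }

    // Returns whether obj was still reachable after a GC request issued before the fence.
    public static boolean aliveUntilFence() {
        Object obj = new Object();
        WeakReference<Object> ref = new WeakReference<>(obj);
        try {
            System.gc(); // only a hint, but obj must survive it either way
            return ref.get() != null;
        } finally {
            Reference.reachabilityFence(obj); // obj stays strongly reachable up to here
        }
    }

    public static void main(String[] args) {
        System.out.println(aliveUntilFence()); // true
    }
}
```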

Prevent Java 7 from premature GC

Similar to Can JIT be prevented from optimising away method calls? I'm attempting to track memory usage of long-lived data store objects, however I'm finding that if I initialize a store, log the system memory, then initialize another store, sometimes the compiler (presumably the JIT) is smart enough to notice that these objects are no longer needed.
public class MemTest {
public static void main(String[] args) {
logMemory("Initial State");
MemoryHog mh = new MemoryHog();
logMemory("Built MemoryHog");
MemoryHog mh2 = new MemoryHog();
logMemory("Built Second MemoryHog"); // by here, mh may be GCed
}
}
Now the suggestion in the linked thread is to keep a pointer to these objects, but the GC appears to be smart enough to tell that the objects aren't used by main() anymore. I could add a call to these objects after the last logMemory() call, but that's a rather manual solution - every time I test an object, I have to do some sort of side-effect triggering call after the final logMemory() call, or I may get inconsistent results.
I'm looking for general case solutions; I understand that adding a call like System.out.println(mh.hashCode()+mh2.hashCode()) at the end of the main() method would be sufficient, but I dislike this for several reasons. First, it introduces an external dependency on the testing above - if the SOUT call is removed, the behavior of the JVM during the memory logging calls may change. Second, it's prone to user-error; if the objects being tested above change, or new ones are added, the user must remember to manually update this SOUT call as well, or they'll introduce difficult to detect inconsistencies in their test. Finally, I dislike that this solution prints at all - it seems like an unnecessary hack that I can avoid with a better understanding of the JIT's optimizations. To the last point, Patricia Shanahan's answer offers a reasonable solution (explicitly print that the output is for memory sanity purposes) but I'd still like to avoid it if possible.
So my initial solution is to store these objects in a static list, and then iterate over them in the main class's finalize method*, like so:
public class MemTest {
private static ArrayList<Object> objectHolder = new ArrayList<>();
public static void main(String[] args) {
logMemory("Initial State", null);
MemoryHog mh = new MemoryHog();
logMemory("Built MemoryHog", mh); // adds mh to objectHolder
MemoryHog mh2 = new MemoryHog();
logMemory("Built Second MemoryHog", mh2); // adds mh2 to objectHolder
}
protected void finalize() throws Throwable {
for(Object o : objectHolder) {
o.hashCode();
}
}
}
But now I've only offloaded the problem one step - what if the JIT optimizes away the loop in the finalize method, and decides these objects don't need to be saved? Admittedly, maybe simply holding the objects in the main class is enough for Java 7, but unless it's documented that the finalize method can't be optimized away, there's still nothing theoretically preventing the JIT/GC from getting rid of these objects early, since there are no side effects in the contents of my finalize method.
One possibility would be to change the finalize method to:
protected void finalize() throws Throwable {
int codes = 0;
for(Object o : objectHolder) {
codes += o.hashCode();
}
System.out.println(codes);
}
As I understand it (and I could be wrong here), calling System.out.println() will prevent the JIT from getting rid of this code, since it's a method with external side effects, so even though it doesn't impact the program, it can't be removed. This is promising, but I don't really want some sort of gibberish being output if I can help it. The fact that the JIT can't (or shouldn't!) optimize away System.out.println() calls suggests to me that the JIT has a notion of side effects, and if I can tell it this finalize block has such side effects, it should never optimize it away.
So my questions:
Is holding a list of objects in the main class enough to prevent them from ever being GCed?
Is looping over those objects and calling something trivial like .hashCode() in the finalize method enough?
Is computing and printing some result in this method enough?
Are there other methods (like System.out.println) the JIT is aware of that cannot be optimized away, or even better, is there some way to tell the JIT not to optimize away a method call / code block?
*Some quick testing confirms, as I suspected, that the JVM doesn't generally run the main class's finalize method, it abruptly exits. The JIT/GC may still not be smart enough to GC my objects simply because the finalize method exists, even if it doesn't get run, but I'm not confident that's always the case. If it's not documented behavior, I can't reasonably trust it will remain true, even if it's true now.

Here's a plan that may be overkill, but should be safe and reasonably simple:
Keep a List of references to the objects.
At the end, iterate over the list summing the hashCode() results.
Print the sum of the hash codes.
Printing the sum ensures that the final loop cannot be optimized out. The only extra work per object creation is one call to add it to the List.
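A minimal sketch of that plan follows. The names KeepAliveDemo, KEPT_ALIVE, keep and sumHashCodes are made up for illustration, and plain byte arrays stand in for the MemoryHog objects from the question.

```java
import java.util.ArrayList;
import java.util.List;

class KeepAliveDemo {
    // Static list: its contents stay strongly reachable while the class is loaded.
    static final List<Object> KEPT_ALIVE = new ArrayList<>();

    // Hypothetical helper: register an object and hand it back to the caller.
    static <T> T keep(T obj) {
        KEPT_ALIVE.add(obj);
        return obj;
    }

    // Sum the hash codes of everything that was registered.
    static int sumHashCodes() {
        int sum = 0;
        for (Object o : KEPT_ALIVE) {
            sum += o.hashCode();
        }
        return sum;
    }

    public static void main(String[] args) {
        byte[] first = keep(new byte[1024]);   // stand-in for a MemoryHog
        byte[] second = keep(new byte[2048]);
        // ... measurements would go here ...
        // Printing the sum gives the final loop an observable effect.
        System.out.println(sumHashCodes());
    }
}
```

The only change at each creation site is wrapping the constructor call in keep(...).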

Yes, it would be legal for mh1 to be garbage collected at that point. At that point, there is no code that could possibly use the variable. If the JVM could detect this, then the corresponding MemoryHog object will be treated as unreachable ... if the GC were to run at that point.
A later call like System.out.println(mh1) would be sufficient to inhibit collection of the object. So would using it in a "computation"; e.g.
if (mh1 == mh2) { System.out.println("the sky is falling!"); }
Is holding a list of objects in the main class enough to prevent them from ever being GCed?
It depends on where the list is declared. If the list was a local variable, and it became unreachable before mh1, then putting the object into the list will make no difference.
Is looping over those objects and calling something trivial like .hashCode() in the finalize method enough?
By the time the finalize method is called, the GC has already decided that the object is unreachable. The only way that the finalize method could prevent the object being deleted would be to add it to some other (reachable) data structure or assign it to a (reachable) variable.
Are there other methods (like System.out.println) the JIT is aware of that cannot be optimized away,
Yes ... anything that makes the object reachable.
or even better, is there some way to tell the JIT not to optimize away a method call / code block?
No way to do that ... apart from making sure that the method call or code block does something that contributes to the computation being performed.
UPDATE
First, what is going on here is not really JIT optimization. Rather, the JIT is emitting some kind of "map" that the GC is using to determine when local variables (i.e. variables on the stack) are dead ... depending on the program counter (PC).
Your examples to inhibit collection all involve blocking the JIT via SOUT, I'd like to avoid that somewhat hacky solution.
Hey ... ANYTHING that depends on the exact timing of when things are garbage collected is a hack. You are not supposed to do that in a properly engineered application.
I updated my code to make it clear that the list that's holding my objects is a static variable of the main class, but it seems if the JIT's smart enough it could still theoretically GC these values once it knows the main method doesn't need them.
I disagree. In practice, the JIT cannot determine that a static will never be referenced. Consider these cases:
Before the JIT runs, it appears that nothing will use static s again. After the JIT has run, the application loads a new class that refers to s. If the JIT "optimized" the s variable, the GC would treat it as unreachable, and either null it or create a dangling reference. When the dynamically loaded class then looked at s it would see the wrong value ... or worse.
If the application ... or any libraries used by the application ... uses reflection, then it can refer to the value of any static variable without this being detectable by the JIT.
So while it is theoretically possible to do this optimization in a small number of cases:
in the vast majority of cases, you can't, and
in the few cases that you can, the pay-off (in terms of performance improvement) is most likely negligible.
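The reflection case above can be sketched as follows. Holder, ReflectionDemo and readStatic are hypothetical names used only for illustration. The point is that the field name is an ordinary string at run time, so nothing in the compiled code tells the JIT that s will be read again.

```java
import java.lang.reflect.Field;

class Holder {
    // No code in this file reads 's' directly after initialization.
    static String s = "still needed";
}

class ReflectionDemo {
    // Looks the field up by name at run time. The name could just as well
    // come from configuration or user input.
    static Object readStatic(Class<?> cls, String fieldName) {
        try {
            Field f = cls.getDeclaredField(fieldName);
            f.setAccessible(true);
            return f.get(null); // null receiver because the field is static
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readStatic(Holder.class, "s"));
    }
}
```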
I similarly updated my code to clarify that I'm talking about the finalize method of the main class.
The finalize method of the main class is irrelevant because:
you are not creating an instance of the main class, and
the finalize method CANNOT refer to the local variables of another method (e.g. the main method).
... its existence prevents the JIT from nuking my static list.
Not true. The static list can't be nuked anyway; see above.
As I understand it, there's something special about SOUT that the JIT is aware of that prevents it from optimizing such calls away.
There is nothing special about sout. It is just something that we KNOW that influences the results of the computation and that we therefore KNOW that the JIT cannot legally optimize away.

Java reusing (static?) objects as temporary objects for performance

I need to call methods of a class with multiple methods very often in a simulation loop.
Some of these methods need to access temporary objects for storing information in them. After leaving the method the stored data is not needed anymore.
For example:
Class class {
method1() {
...
SomeObject temp = new SomeObject();
...
}
method2() {
...
SomeObject temp = new SomeObject();
SomeObject temp2 = new SomeObject();
...
}
}
I need to optimize as much as possible. The most expensive (removable) problem is that too many allocations happen.
I assume it would be better not to allocate the space needed for those objects every time so I want to keep them.
Would it be more efficient to store them in a static way or not?
Like:
Class class {
private (static?) SomeObject temp;
private (static?) SomeObject temp2;
methods...
}
Or is there even a better way? Thank you for your help!
Edit based on answers:
The actual problem is not the memory footprint but the garbage collector cleaning up the mess.
SomeObject is a Point2D-like class, nothing memory expensive (in my opinion).
I am not sure whether it is better to use (possibly static) class-level objects as placeholders or some more advanced method which I am not aware of.

I would be wary of premature optimization in this example. The usual downsides are that it makes the code more complex and harder to read (and complexity makes bugs more likely), and that it may not deliver the speedup you expected. For a simple object such as a 2D point, I wouldn't worry about reuse. Reuse typically pays off most when you are working with a large amount of memory, avoiding lengthy or expensive constructors, or pulling object construction out of a frequently executed tight loop.
Some different strategies you could try:
Push responsibility to the caller: One way would be to have the caller pass in a pre-initialized object, making the method parameter final. Whether this works depends on what you need to do with the object.
Pointer to a temporary object as a method parameter: Another way would be to have the caller pass an object as a parameter whose sole purpose is to give the method a place for its temporary storage. I think this technique is more common in C++, but it works similarly in Java and sometimes shows up in places like graphics programming.
Object pool: One common way to reuse temporary objects is an object pool, where objects are handed out from a fixed bank of "available" objects. This has some overhead, but if the objects are large and frequently used for only short periods of time, such that memory fragmentation might be a concern, the overhead may be small enough to be worth considering.
Member variable: If you are not concerned about concurrent calls to the method (or have used synchronization to prevent them), you could emulate the C++ism of a "local static" variable by creating a member variable of the class for your storage. It makes the code less readable and leaves slightly more room for accidental interference from other parts of your code that use the variable, but it has lower overhead than an object pool and does not require changes to your method signature. If you do this, you may also wish to mark the variable with the transient keyword to indicate that it does not need to be serialized.
I would shy away from a static variable for the temporary unless the method is also static, because it may carry an undesirable memory overhead for the entire time your program runs, and it has the same downsides as a member variable, only worse, because every instance of the class shares it.
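As an illustration of the second strategy (the caller supplies the temporary), here is a minimal sketch. Vec2 is a hypothetical mutable stand-in for SomeObject, and Physics.distance is an invented method. Neither comes from the question.

```java
// Hypothetical mutable 2D vector standing in for SomeObject.
final class Vec2 {
    double x, y;

    Vec2 set(double x, double y) {
        this.x = x;
        this.y = y;
        return this;
    }
}

class Physics {
    // The caller owns 'scratch'; this method overwrites it instead of allocating.
    static double distance(Vec2 a, Vec2 b, Vec2 scratch) {
        scratch.set(b.x - a.x, b.y - a.y);
        return Math.sqrt(scratch.x * scratch.x + scratch.y * scratch.y);
    }

    public static void main(String[] args) {
        Vec2 scratch = new Vec2();            // allocated once, outside the loop
        Vec2 origin = new Vec2().set(0, 0);
        Vec2 p = new Vec2();
        double total = 0;
        for (int i = 0; i < 1000; i++) {
            p.set(3 * i, 4 * i);
            total += distance(origin, p, scratch);
        }
        System.out.println(total);
    }
}
```

The trade-off is visible in the signature: every caller now has to know about, and supply, the scratch object, and the method is only safe if no two threads share the same scratch instance.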

Keep in mind that temp and temp2 are not themselves objects, but variables pointing to an object of type SomeObject. The way you are planning to do it, the only difference would be that temp and temp2 would be instance variables instead of local variables. Calling
temp = new SomeObject();
Would still allocate a new SomeObject onto the heap.
Additionally, making them static or instance variables instead of local would cause the last assigned SomeObjects to be kept strongly reachable (as long as your class instance is in scope for instance variables), preventing them from being garbage collected until the variables are reassigned.
Optimizing in this way probably isn't effective. Currently, once temp and temp2 are out of scope, the SomeObjects they point to will be eligible for garbage collection.
If you're still interested in memory optimization, you will need to show what the SomeObject is in order to get advice as to how you could cache the information it's holding.

How large are these objects? It seems to me that you could have class-level objects (not necessarily static; I'll come back to that). For SomeObject, you could have a method that purges its contents. When you are done using it in one place, call the method to purge its contents.
As far as static, will multiple callers use this class and have different values? If so, don't use static.
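A rough sketch of this idea, assuming a hypothetical Scratch class with a clear() method and a Simulation class that owns one instance of it (all names invented for illustration):

```java
// Hypothetical reusable object with a method that purges its contents.
final class Scratch {
    double x, y;

    void clear() {
        x = 0;
        y = 0;
    }
}

class Simulation {
    // Instance field, not static: each Simulation has its own scratch space,
    // so two instances cannot overwrite each other's temporary values.
    private final Scratch temp = new Scratch();

    double step(double dx, double dy) {
        temp.clear();   // purge whatever the previous call left behind
        temp.x += dx;
        temp.y += dy;
        return temp.x * temp.y;
    }

    public static void main(String[] args) {
        Simulation sim = new Simulation();
        System.out.println(sim.step(2, 3)); // prints 6.0
        System.out.println(sim.step(4, 5)); // prints 20.0
    }
}
```

This version is not safe for concurrent calls to step on the same Simulation instance.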

First, you need to make sure that you really have this problem. The benefit of a garbage collector is that it takes care of all temporary objects automatically.
Anyway, suppose you run a single-threaded application and you use at most MAX_OBJECTS at any given time. One solution could be like this:
public class ObjectPool {
private final int MAX_OBJECTS = 5;
private final Object [] pool = new Object [MAX_OBJECTS];
private int position = 0;
public Object getObject() {
// advance to the next object
position = (position + 1) % MAX_OBJECTS;
// check and create new object if needed
if(pool[position] == null) {
pool[position] = new Object();
}
// return next object
return pool[position];
}
// make it a singleton
private ObjectPool() {}
private static final ObjectPool instance = new ObjectPool();
public static ObjectPool getInstance() { return instance;}
}
And here is the usage example:
public class ObjectPoolTest {
public static void main(String[] args) {
for(int n = 0; n < 6; n++) {
Object o = ObjectPool.getInstance().getObject();
System.out.println(n + ") " + o.hashCode());
}
}
}
Here is the output:
0) 1660364311
1) 1340465859
2) 2106235183
3) 374283533
4) 603737068
5) 1660364311
Notice that the first and the last numbers are the same: iteration MAX_OBJECTS + 1 returns the same temporary object as the first.

Choosing when to instantiate classes

I recently wrote a class for an assignment in which I had to store names in an ArrayList (in Java). I initialized the ArrayList as an instance variable private ArrayList<String> names. Later when I checked my work against the solution, I noticed that they had initialized their ArrayList in the run() method instead.
I thought about this for a bit and I kind of feel it might be a matter of taste, but in general how does one choose in situations like this? Does one take up less memory or something?
PS I like the instance variables in Ruby that start with an @ symbol: they are lovelier.
(meta-question: What would be a better title for this question?)

In the words of the great Knuth "Premature optimization is the root of all evil".
Just worry that your program functions correctly and that it does not have bugs. This is far more important than an obscure optimization that will be hard to debug later on.
But to answer your question: if you initialize in the field declaration, the memory is allocated when an instance of your class is created (i.e. when its constructor runs). If you initialize in a method, the allocation occurs later, when you call that specific method.
So it is only a question of initializing later... this is called lazy initialization in the industry.
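A minimal sketch contrasting the two placements, using hypothetical EagerNames and LazyNames classes. A field initializer runs as part of constructing the object, while the in-method version defers the allocation until run() is first called.

```java
import java.util.ArrayList;
import java.util.List;

class EagerNames {
    // Allocated as part of constructing each EagerNames instance.
    private final List<String> names = new ArrayList<>();

    void run() { names.add("Ada"); }

    int count() { return names.size(); }
}

class LazyNames {
    // Stays null until run() is called for the first time.
    private List<String> names;

    void run() {
        if (names == null) {
            names = new ArrayList<>();
        }
        names.add("Ada");
    }

    int count() { return names == null ? 0 : names.size(); }
}

class InitDemo {
    public static void main(String[] args) {
        EagerNames eager = new EagerNames();
        LazyNames lazy = new LazyNames();
        eager.run();
        lazy.run();
        System.out.println(eager.count() + " " + lazy.count());
    }
}
```

The lazy version has to guard every access against null, which is the extra complexity you pay for deferring one small allocation.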

Initialization
As a rule of thumb, try to initialize variables when they are declared.
If the value of a variable is intended never to change, make that explicit with use of the final keyword. This helps you reason about the correctness of your code, and while I'm not aware of compiler or JVM optimizations that recognize the final keyword, they would certainly be possible.
Of course, there are exceptions to this rule. For example, a variable may be assigned in an if–else or a switch. In a case like that, a "blank" declaration (one with no initialization) is preferable to an initialization that is guaranteed to be overwritten before the dummy value is read.
/* DON'T DO THIS! */
Color color = null;
switch(colorCode) {
case RED: color = new Color("crimson"); break;
case GREEN: color = new Color("lime"); break;
case BLUE: color = new Color("azure"); break;
}
color.fill(widget);
Now you have a NullPointerException if an unrecognized color code is presented. It would be better not to assign the meaningless null. The compiler would produce an error at the color.fill() call, because it would detect that you might not have initialized color.
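A sketch of the blank-declaration alternative. Plain strings stand in for the Color class above, and ColorPicker and pick are hypothetical names. Removing the default branch makes the compiler reject the return statement, because color might not have been assigned.

```java
class ColorPicker {
    static final int RED = 0, GREEN = 1, BLUE = 2;

    static String pick(int colorCode) {
        final String color; // blank declaration: no dummy value
        switch (colorCode) {
            case RED:   color = "crimson"; break;
            case GREEN: color = "lime";    break;
            case BLUE:  color = "azure";   break;
            default:
                // Unknown codes fail here, loudly, instead of surfacing
                // later as a NullPointerException.
                throw new IllegalArgumentException("Unknown color code: " + colorCode);
        }
        return color;
    }

    public static void main(String[] args) {
        System.out.println(pick(GREEN)); // prints lime
    }
}
```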
In order to answer your question in this case, I'd have to see the code in question. If the solution initialized it inside the run() method, it must have been used either as temporary storage, or as a way to "return" the results of the task.
If the collection is used as temporary storage, and isn't accessible outside of the method, it should be declared as a local variable, not an instance variable, and most likely, should be initialized where it's declared in the method.
Concurrency Issues
For a beginning programming course, your instructor probably wasn't trying to confront you with the complexities of concurrent programming—although if that's the case, I'm not sure why you were using a Thread. But, with current trends in CPU design, anyone who is learning to program needs to have a firm grasp on concurrency. I'll try to delve a little deeper here.
Returning results from a thread's run method is a bit tricky. This method belongs to the Runnable interface, and there's nothing stopping multiple threads from executing the run method of a single instance. The resulting concurrency issues are part of the motivation behind the Callable interface introduced in Java 5. It's much like Runnable, but can return a result in a thread-safe manner, and throw an Exception if the task can't be executed.
It's a bit of a digression, but if you are curious, consider the following example:
class Oops extends Thread { /* Note that Thread implements "Runnable" */
private int counter = 0;
private Collection<Integer> state = ...;
public void run() {
state.add(counter);
counter++;
}
public static void main(String... argv) throws Exception {
Oops oops = new Oops();
oops.start();
Thread t2 = new Thread(oops); /* Now pass the same Runnable to a new Thread. */
t2.start(); /* Execute the "run" method of the same instance again. */
...
}
}
By the end of the main method you pretty much have no idea what the "state" of the Collection is. Two threads are working on it concurrently, and we haven't specified whether the collection is safe for concurrent use. If we initialize it inside the thread, at least we can say that eventually, state will contain one element, but we can't say whether it's 0 or 1.
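For comparison, the Callable approach mentioned above might look like the following sketch. CollectNames, CallableDemo and runTask are hypothetical names. The collection is local to call(), and the result is handed back through a Future rather than through shared state.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class CollectNames implements Callable<List<String>> {
    @Override
    public List<String> call() {
        // Local to this invocation, so no other thread can see it.
        List<String> names = new ArrayList<>();
        names.add("Ada");
        names.add("Grace");
        return names;
    }
}

class CallableDemo {
    static List<String> runTask() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Future<List<String>> result = pool.submit(new CollectNames());
            return result.get(); // blocks until call() has finished
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(runTask()); // prints [Ada, Grace]
    }
}
```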

From wikibooks:
There are three basic kinds of scope for variables in Java:
local variable, declared within a method in a class, valid for (and occupying storage only for) the time that method is executing. Every time the method is called, a new copy of the variable is used.
instance variable, declared within a class but outside any method. It is valid for and occupies storage for as long as the corresponding object is in memory; a program can instantiate multiple objects of the class, and each one gets its own copy of all instance variables. This is the basic data structure rule of Object-Oriented programming; classes are defined to hold data specific to a "class of objects" in a given system, and each instance holds its own data.
static variable, declared within a class as static, outside any method. There is only one copy of such a variable no matter how many objects are instantiated from that class.
So yes, memory consumption is an issue, especially if the ArrayList inside run() is local.
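The three kinds of variable can be seen side by side in this small sketch (Counter and its members are hypothetical names, not from the question):

```java
class Counter {
    static int created = 0;   // static: one copy shared by all Counter objects
    int hits = 0;             // instance: one copy per Counter object

    Counter() {
        created++;
    }

    int hit() {
        int step = 1;         // local: exists only while hit() is running
        hits += step;
        return hits;
    }

    public static void main(String[] args) {
        Counter a = new Counter();
        Counter b = new Counter();
        a.hit();
        a.hit();
        b.hit();
        System.out.println(a.hits + " " + b.hits + " " + Counter.created); // prints 2 1 2
    }
}
```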

I am not sure I completely understand your problem.
But as far as I understand it right now, the performance/memory benefit will be rather minor. Therefore I would definitely favour whichever option is easier to work with.
So do what suits you the best. Only address performance/memory optimisation when needed.

My personal rule of thumb for instance variables is to initialize them, at least with a default value, either:
at declaration time, i.e.
private ArrayList<String> myStrings = new ArrayList<String>();
in the constructor
If it's something that really is an instance variable, and represents state of the object, it is then completely initialized by the time the constructor exits. Otherwise, you open yourself to the possibility of trying to access the variable before it has a value. Of course, that doesn't apply to primitives where you will get a default value automatically.
For static (class-level) variables, initialize them in the declaration or in a static initializer. I use a static initializer if I have to do calculations or other work to get a value. Initialize in the declaration if you're just calling new Foo() or setting the variable to a known value.
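A short sketch of both static cases, using a hypothetical Primes class: LIMIT is a known value and is set in its declaration, while SMALL_PRIMES needs a computation and is filled in a static initializer block.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class Primes {
    // Known value: initialize in the declaration.
    static final int LIMIT = 20;

    // Needs a computation: use a static initializer block.
    static final List<Integer> SMALL_PRIMES;
    static {
        List<Integer> found = new ArrayList<>();
        for (int n = 2; n < LIMIT; n++) {
            boolean prime = true;
            for (int d = 2; d * d <= n; d++) {
                if (n % d == 0) {
                    prime = false;
                    break;
                }
            }
            if (prime) {
                found.add(n);
            }
        }
        SMALL_PRIMES = Collections.unmodifiableList(found);
    }

    public static void main(String[] args) {
        System.out.println(SMALL_PRIMES); // prints [2, 3, 5, 7, 11, 13, 17, 19]
    }
}
```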

You have to avoid Lazy initialization. It leads to problems later.
But if you have to do it because the initialization is too heavy you have to do it like this:
Static fields:
// Lazy initialization holder class idiom for static fields
private static class FieldHolder {
static final FieldType field = computeFieldValue();
}
static FieldType getField() { return FieldHolder.field; }
Instance fields:
// Double-check idiom for lazy initialization of instance fields
private volatile FieldType field;
FieldType getField() {
FieldType result = field;
if (result == null) { // First check (no locking)
synchronized(this) {
result = field;
if (result == null) // Second check (with locking)
field = result = computeFieldValue();
}
}
return result;
}
According to Joshua Bloch's book "Effective Java™, Second Edition" (ISBN-13: 978-0-321-35668-0):
"Use lazy initialization judiciously"
