Why exactly PhantomReference should be preferred to finalize? - java

They both can be used for cleanup, there are almost no guarantees for either, but PR requires more harness coding. So, having two options, why exactly should I prefer one to the other?
Javadoc 9 describes finalize as very problematic, but that doesn't make its alternative better automatically, right?
Also javadoc describes PhantomReference as providing "more flexible and efficient ways to release resources when an object becomes unreachable", but without a reason specified. Well, I guess these guys know some secrets, but I'm wondering - can't this choice be made more obvious?
The difference
Here are all the differences between finalize (FZ) and phantom reference (PR) I discovered, please correct me if I missed something.
Can be used for cleanup actions?
Yes for both.
Requires a new thread to maintain?
PR: yes, you must define a queue watcher thread in order to do cleanup ASAP
FZ: no
Requires a new class to define?
PR: yes, you must extend PhantomReference to act meaningfully
FZ: no
Can cleanup processor access the referent object?
PR: no
FZ: yes, and that's handy
Does it work reliably in my personal practice?
Yes for both.
Can lead to performance issues, deadlocks, and hangs?
Yes for both. Depends on your code, doesn't it?
Can errors in a cleanup processor lead to resource leaks?
Yes for both. Depends on your code, doesn't it?
Cancelable if it is no longer necessary?
PR: yes
FZ: no, if speaking strictly, but is immediate return that bad?
Is invocation ordering between multiple instances specified?
PR: no info
FZ: no - "no ordering is specified among calls to finalize methods of different objects" (java.lang.Object)
Invocation guaranteed?
PR: no info - you can only "request to be notified of changes in an object's reachability" (java.lang.ref)
FZ: no - "The finalize method might be called on a finalizable object only after an indefinite delay, if at all" (java.lang.Object)
Any guarantees regarding the timing?
PR: no - "Some time after the garbage collector determines that the reachability of the referent has changed to the value corresponding to the type of the reference" (java.lang.ref)
FZ: no - "The Java programming language does not specify how soon a finalizer will be invoked" (JLS), "The finalize method might be called on a finalizable object only after an indefinite delay, if at all" (java.lang.Object)
Can this resurrect during processing?
PR: no, and that's not bad
FZ: yes, officially supported
Links:
Java 9 phantom reference
Java 9 finalize
JLS on finalize

Most of this has been addressed in Should Java 9 Cleaner be preferred to finalization? already. Since the Cleaner API builds on the PhantomReference, most of it applies as well to using PhantomReference directly.
In short, you are not supposed to replace finalize() usages with PhantomReference or Cleaner. For non-memory resources, you should prefer explicit closing right after using them, with the try-with-resources construct wherever feasible. The interaction with the garbage collector may act as a fallback to detect programming errors, but should not become the preferred way of resource cleanup.
In that regard, the ability to opt out of the cleanup when the resource has been closed correctly has significant relevance, as that will be the norm. You are underestimating its impact. Once your class has a nontrivial finalizer, its objects require two garbage collection cycles to get reclaimed, even when finalize() returns immediately after checking a condition. This can make the difference between getting collected in a minor GC and being promoted to the old generation.
The most extreme example would be a shortly used resource represented by a purely local object whose memory allocation could get elided completely after Escape Analysis has been applied, whereas the presence of a nontrivial finalize() method invariably implies a global escape which prevents this optimization (amongst others, like lock elimination).
While the Cleaner API does start a dedicated thread, there is no requirement to do so when using PhantomReference. You could as well poll the queue right when the resource is used or a new one is about to be allocated. That doesn’t guarantee fast release of the resource (which GC-triggered cleanup doesn’t guarantee anyway), but you’re ensuring that an allocation doesn’t fail while a collected object unnecessarily holds resources.
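To illustrate the polling idea, here is a minimal sketch. HandlePool, HandleRef and the integer "handles" are hypothetical names made up for this example; the only real API used is PhantomReference and ReferenceQueue.poll(), which never blocks. The counter stands in for whatever release call your resource actually needs.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical pool that reclaims leaked handles by polling, without a watcher thread. */
final class HandlePool {
    /** Carries the state needed for cleanup; it never exposes the referent. */
    private static final class HandleRef extends PhantomReference<Object> {
        final int handleId;

        HandleRef(Object owner, int handleId, ReferenceQueue<Object> queue) {
            super(owner, queue);
            this.handleId = handleId;
        }
    }

    private final ReferenceQueue<Object> queue = new ReferenceQueue<>();
    // The references must stay strongly reachable, otherwise they are never enqueued.
    private final Set<HandleRef> live = new HashSet<>();
    private int nextId;
    private int released;

    /** Drains the queue first, so leaked handles are released before a new one is handed out. */
    synchronized int acquire(Object owner) {
        drain();
        int id = nextId++;
        live.add(new HandleRef(owner, id, queue));
        return id;
    }

    /** Non-blocking: poll() returns null as soon as the queue is empty. */
    synchronized int drain() {
        int processed = 0;
        for (Reference<?> r; (r = queue.poll()) != null; processed++) {
            HandleRef ref = (HandleRef) r;
            if (live.remove(ref)) {
                released++; // stand-in for actually releasing handle ref.handleId
            }
        }
        return processed;
    }

    synchronized int releasedCount() {
        return released;
    }
}
```

As long as the owner objects are strongly reachable, drain() finds nothing to do; when and whether a collected owner shows up in the queue is still up to the garbage collector.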
Even if you use dedicated threads for the cleanup, there’s a fundamental difference between starting threads under your control and having finalizers invoked by unspecified JVM threads outside your control, where a faulty finalize() method of another library could block the thread needed for your cleanup. The JVM may invoke multiple finalizers concurrently, whereas you can decide how many threads you use for your PhantomReference-based cleanup.

Related

Should Java 9 Cleaner be preferred to finalization?

In Java, overriding the finalize method gets a bad rap, although I don't understand why. Classes like FileInputStream use it to ensure close gets called, in both Java 8 and Java 10. Nevertheless, Java 9 introduced java.lang.ref.Cleaner which uses the PhantomReference mechanism instead of GC finalization. At first, I thought it was just a way to add finalization to third-party classes. However, the example given in its javadoc shows a use-case that can easily be rewritten with a finalizer.
Should I be rewriting all of my finalize methods in terms of Cleaner? (I don't have many, of course. Just some classes that use OS resources particularly for CUDA interop.)
As far as I can tell, Cleaner (via PhantomReference) avoids some of the dangers of finalizers. In particular, you don't have any access to the cleaned object and so you can't resurrect it or any of its fields.
However, that is the only advantage I can see. Cleaner is also non-trivial. In fact, it and finalization both use a ReferenceQueue! (Don't you just love how easy it is to read the JDK?) Is it faster than finalization? Does it avoid waiting for two GCs? Will it avoid heap exhaustion if many objects are queued for cleanup? (The answer to all of those would appear to me to be no.)
Finally, there's actually nothing guaranteeing to stop you from referencing the target object in the cleaning action. Be careful to read the long API Note! If you do end up referencing the object, the whole mechanism will silently break, unlike finalization which always tries to limp along. Finally, while the finalization thread is managed by the JVM, creating and holding Cleaner threads is your own responsibility.
You are not supposed to replace all finalize() methods with a Cleaner. The fact that the deprecation of finalize() method and the introduction of (a public) Cleaner happened in the same Java version, only indicates that a general work on the topic happened, not that one is supposed to be a substitute of the other.
Other related work of that Java version is the removal of the rule that a PhantomReference is not automatically cleared (yes, before Java 9, using a PhantomReference instead of finalize() still required two GC cycles to reclaim the object) and the introduction of Reference.reachabilityFence(…).
The first alternative to finalize() is not to have a garbage collection dependent operation at all. It’s good when you say that you don’t have many, but I’ve seen entirely obsolete finalize() methods in the wild. The problem is that finalize() looks like an ordinary protected method, and the tenacious myth that finalize() is some kind of destructor is still spread on some internet pages. Marking it deprecated signals to the developer that this is not the case, without breaking compatibility. Using a mechanism that requires explicit registration helps in understanding that this is not the normal program flow. And it doesn’t hurt when it looks more complicated than overriding a single method.
In case your class does encapsulate a non-heap resource, the documentation states:
Classes whose instances hold non-heap resources should provide a method to enable explicit release of those resources, and they should also implement AutoCloseable if appropriate.
(so that’s the preferred solution)
The Cleaner and PhantomReference provide more flexible and efficient ways to release resources when an object becomes unreachable.
So when you truly need interaction with the garbage collector, even this brief documentation comment names two alternatives, as PhantomReference is not mentioned as the hidden-from-developer backend of Cleaner here; using PhantomReference directly is an alternative to Cleaner, which might be even more complicated to use, but also provides even more control over timing and threads, including the possibility to cleanup within the same thread which used the resource. (Compare to WeakHashMap, which has such cleanup avoiding the expenses of thread safe constructs). It also allows dealing with exceptions thrown during the cleanup, in a better way than silently swallowing them.
But even Cleaner solves more problems than you are aware of.
A significant problem, is the time of registration.
An object of a class with a nontrivial finalize() method is registered when the Object() constructor has been executed. At this point, the object has not been initialized yet. If your initialization is terminated with an exception, the finalize() method will still be called. It might be tempting to solve this with the object’s data, e.g. by setting an initialized flag to true, but you can only vouch for your own instance data, not for the data of a subclass, which has still not been initialized when your constructor returns.
Registering a cleaner requires a fully constructed Runnable holding all necessary data for the cleanup, without a reference to the object under construction. You may even defer the registration when the resource allocation did not happen in the constructor (think of an unbound Socket instance or a Frame which is not atomically connected to a display)
A finalize() method can be overridden, without calling the superclass method or failing to do this in the exceptional case. Preventing the method from overriding, by declaring it final, does not allow the subclasses to have such cleanup actions at all. In contrast, every class may register cleaners without interference to the other cleaners.
Granted, you could have solved such issues with encapsulated objects; however, the design of having a finalize() method for every class guided developers in the other, wrong direction.
As you already discovered, there is a clean() method, which performs the cleanup action immediately and removes the cleaner. So when providing an explicit close method or even implementing AutoCloseable, this is the preferred way of cleanup, disposing of the resource in a timely manner and getting rid of all the problems of garbage collector based cleanup.
Note that this harmonizes with the points mentioned above. There can be multiple cleaners for an object, e.g. registered by different classes in the hierarchy. Each of them can be triggered individually, with an intrinsic solution regarding access rights, only who registered the cleaner gets hands on the associated Cleanable to be able to invoke the clean() method.
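The registration and clean() points above can be sketched roughly as follows. NativeBuffer, its State class and the RELEASES counter are hypothetical names for illustration only; the counter stands in for freeing a real native resource. The shape follows the pattern shown in the Cleaner javadoc: the cleanup state is a static nested class, so it cannot accidentally capture the object being tracked.

```java
import java.lang.ref.Cleaner;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical wrapper around some non-heap resource identified by a long handle. */
final class NativeBuffer implements AutoCloseable {
    private static final Cleaner CLEANER = Cleaner.create();
    /** Counts releases, so the effect of close() is observable in this sketch. */
    static final AtomicInteger RELEASES = new AtomicInteger();

    /** Static nested class: holds only what cleanup needs, never the NativeBuffer itself. */
    private static final class State implements Runnable {
        private final long handle;

        State(long handle) {
            this.handle = handle;
        }

        @Override
        public void run() {
            // A real implementation would free the resource behind 'handle' here.
            RELEASES.incrementAndGet();
        }
    }

    private final Cleaner.Cleanable cleanable;

    NativeBuffer(long handle) {
        // Registration happens only after the cleanup state is fully constructed.
        this.cleanable = CLEANER.register(this, new State(handle));
    }

    @Override
    public void close() {
        // Runs the action at most once and deregisters it; later calls are no-ops.
        cleanable.clean();
    }
}
```

Used with try-with-resources, the cleanup runs at the end of the block, and the garbage collector only matters for objects somebody forgot to close.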
That said, it is often overlooked that the worst thing that can happen when managing resources with the garbage collector, is not that the cleanup action may run later or never at all. The worst thing that can happen, is that it runs too early. See finalize() called on strongly reachable object in Java 8 for example. Or, a really nice one, JDK-8145304, Executors.newSingleThreadExecutor().submit(runnable) throws RejectedExecutionException, where a finalizer shuts down the executor service still in use.
Granted, just using Cleaner or PhantomReference does not solve this. But removing finalizers and implementing an alternative mechanism when truly needed, is an opportunity to carefully think about the topic and perhaps insert reachabilityFences where needed. The worst thing you can have, is a method that looks like being easy-to-use, when in fact, the topic is horribly complex and 99% of its use are potentially breaking some day.
Further, while the alternatives are more complex, you said yourself, they are rarely needed. This complexity should only affect a fraction of your code base. And why should java.lang.Object, the base class for all classes, host a method addressing a rare corner case of Java programming?
As pointed out by Elliott in comments, moving ahead with Java9+, the Object.finalize is deprecated and hence it makes more sense to implement methods using Cleaner. Also, from the release notes :
The java.lang.Object.finalize method has been deprecated. The finalization mechanism is inherently problematic and can lead to performance issues, deadlocks, and hangs. The java.lang.ref.Cleaner and java.lang.ref.PhantomReference provide more flexible and efficient ways to release resources when an object becomes unreachable.
Details in Bug Database - JDK-8165641
Use neither.
Trying to recover from resource leaks using Cleaner presents nearly as many challenges as finalize, the worst of which, as mentioned by Holger, is premature finalization (which is a problem not only with finalize but with every kind of soft/weak/phantom reference). Even if you do your best to implement finalization correctly (and, again, I mean any kind of system that uses a soft/weak/phantom reference), you can never guarantee that the resource leaks won't lead to resource exhaustion. The unavoidable fact is that the GC doesn't know about your resources.
Instead, you should assume that resources will be closed correctly (via AutoCloseable, try-with-resources, reference counting, etc.), find and fix bugs rather than hope to work around them, and use finalization (in any of its forms) only as a debugging aid, much like assert.
Resource leaks must be fixed--not worked around.
Finalization should only be used as an assertion mechanism to (try to) notify you that a bug exists. To that end, I suggest taking a look at the Netty-derived almson-refcount. It offers an efficient resource leak detector based on weak references, and an optional reference-counting facility that is more flexible than the usual AutoCloseable. What makes its leak detector great is that it offers different levels of tracking (with different amounts of overhead) and you can use it to capture stack traces of where your leaked objects are allocated and used.
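To make the "assertion mechanism" idea concrete without pulling in a library, here is a rough sketch built on java.lang.ref.Cleaner. It is not how Netty or almson-refcount implement their detectors; TrackedResource, Tracker and LEAKS are hypothetical names for this example. The cleanup action frees nothing, it only reports that close() was never called.

```java
import java.lang.ref.Cleaner;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical resource whose GC hook only reports leaks; it never releases anything. */
final class TrackedResource implements AutoCloseable {
    private static final Cleaner CLEANER = Cleaner.create();
    static final AtomicInteger LEAKS = new AtomicInteger();

    /** Static nested class, so it holds no reference to the tracked object. */
    static final class Tracker implements Runnable {
        private volatile boolean closed;
        // Capturing a stack trace is costly; a real detector might only sample allocations.
        private final Throwable allocationSite = new Throwable("leaked resource was allocated here");

        void markClosed() {
            closed = true;
        }

        @Override
        public void run() {
            if (!closed) {
                LEAKS.incrementAndGet();
                allocationSite.printStackTrace(); // points at the bug to fix
            }
        }
    }

    private final Tracker tracker = new Tracker();
    private final Cleaner.Cleanable cleanable = CLEANER.register(this, tracker);

    @Override
    public void close() {
        // The actual resource release would go here, done deterministically.
        tracker.markClosed();
        cleanable.clean(); // deregisters the tracker; its run() sees closed == true
    }
}
```

A resource that is closed properly never reports anything; one that is dropped without close() may eventually print where it was allocated, if and when the collector notices it.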
Java 9's Cleaner is very similar to traditional finalization (as implemented in OpenJDK), and almost everything (good or bad) that can be said about finalization can be said about Cleaner. Both rely on the garbage collector to place Reference objects on a ReferenceQueue and use a separate thread to run the cleanup methods.
The three main differences are that Cleaner uses PhantomReference instead of what is essentially a WeakReference (phantom reference doesn't allow you to access the object, which ensures it cannot be made reachable, ie zombified), uses a separate thread per Cleaner with a customizable ThreadFactory, and allows the PhantomReferences to be cleared (ie, cancelled) manually and never enqueued.
This provides performance advantages when heavy use is made of Cleaner/finalization. (Unfortunately, I don't have benchmarks to say how much of an advantage.) However, making heavy use of finalization is not normal.
For the normal things that finalize is used for--ie, a last-resort clean-up mechanism for native resources implemented with small, final objects that hold the minimum necessary state, provide AutoCloseable, and aren't allocated millions per second--there is no practical difference between the two approaches other than usage differences (in some aspects finalize is simpler to implement, in others Cleaner helps avoid mistakes). Cleaner doesn't provide any additional guarantees or behaviors (such as guaranteeing that cleaners will run prior to the process exiting--which is essentially impossible to guarantee anyway).
However, finalize has been deprecated. So that's that, I guess. Kind of a dick move. Perhaps the JDK developers are thinking, "why should the JDK provide a native mechanism that can easily be implemented as a library" "n00bs. n00bs everywhere. n00bs, stop using finalize, we hate you so much." It is a good point--and yet, I can't imagine finalize actually disappearing.
A good article that talks about finalization and outlines how alternative-finalization works can be found here: How to Handle Java Finalization's Memory-Retention Issues It paints in broad strokes how Cleaner works.
An example of the kind of code that might use Cleaner or PhantomReference instead of finalize is Netty's reference-counted manual management of direct (non-heap) memory. There, a lot of finalizeable objects get allocated and the alternative-finalization mechanism adopted by Netty makes sense. However, Netty goes a step farther and doesn't create a Reference for each reference-counted object unless the leak detector is set to its highest sensitivity. During usual operation it either doesn't use finalization at all (because if there is a resource leak, you're going to find out about it eventually anyway) or uses sampling (attaches clean-up code to a small fraction of allocated objects).
Netty's ResourceLeakDetector is much cooler than Cleaner.

Should Java finalizer really be avoided also for native peer objects lifecycle management?

In my experience as a C++/Java/Android developer, I have come to learn that finalizers are almost always a bad idea, the only exception being the management of a "native peer" object needed by the java one to call C/C++ code through JNI.
I am aware of the JNI: Properly manage the lifetime of a java object question, but this question addresses the reasons not to use a finalizer at all, not even for native peers. So it's a question/discussion on a confutation of the answers in the aforementioned question.
Joshua Bloch in his Effective Java explicitly lists this case as an exception to his famous advice on not using finalizers:
A second legitimate use of finalizers concerns objects with native peers. A native peer is a native object to which a normal object delegates via native methods. Because a native peer is not a normal object, the garbage collector doesn't know about it and can’t reclaim it when its Java peer is reclaimed. A finalizer is an appropriate vehicle for performing this task, assuming the native peer holds no critical resources. If the native peer holds resources that must be terminated promptly, the class should have an explicit termination method, as described above. The termination method should do whatever is required to free the critical resource. The termination method can be a native method, or it can invoke one.
(Also see "Why is the finalized method included in Java?" question on stackexchange)
Then I watched the really interesting How to manage native memory in Android talk at the Google I/O '17, where Hans Boehm actually advocates against using finalizers to manage native peers of a java object, also citing Effective Java as a reference. After quickly mentioning why explicit delete of the native peer or automatic closing based on scope might not be a viable alternative, he advises using java.lang.ref.PhantomReference instead.
He makes some interesting points, but I am not completely convinced. I will try to run through some of them and state my doubts, hoping someone can shed further light on them.
Starting from this example:
class BinaryPoly {
    long mNativeHandle; // holds a C++ raw pointer
    private BinaryPoly(long nativeHandle) {
        mNativeHandle = nativeHandle;
    }
    private static native long nativeMultiply(long xCppPtr, long yCppPtr);
    BinaryPoly multiply(BinaryPoly other) {
        return new BinaryPoly(nativeMultiply(mNativeHandle, other.mNativeHandle));
    }
    // …
    static native void nativeDelete(long cppPtr);
    protected void finalize() {
        nativeDelete(mNativeHandle);
    }
}
Where a java class holds a reference to a native peer that gets deleted in the finalizer method, Bloch lists the shortcomings of such an approach.
Finalizers can run in arbitrary order
If two objects become unreachable, the finalizers actually run in arbitrary order, that includes the case when two objects who point to each others become unreachable at the same time they can be finalized in the wrong order, meaning that the second one to be finalized actually tries to access an object that’s already been finalized. [...] As a result of that you can get dangling pointers and see deallocated c++ objects [...]
And as an example:
class SomeClass {
    BinaryPoly mMyBinaryPoly;
    …
    // DEFINITELY DON'T DO THIS WITH CURRENT BinaryPoly!
    protected void finalize() {
        Log.v("BPC", "Dropped " + … + mMyBinaryPoly.toString());
    }
}
Ok, but isn't this true also if mMyBinaryPoly is a pure Java object? As I understand it, the problem comes from operating on a possibly finalized object inside its owner's finalizer. In case we are only using the finalizer of an object to delete its own private native peer and not doing anything else, we should be fine, right?
Finalizer may be invoked while the native method is still running
By Java rules, but not currently on Android:
Object x’s finalizer may be invoked while one of x’s methods is still running, and accessing the native object.
Pseudo-code of what multiply() gets compiled to is shown to explain this:
BinaryPoly multiply(BinaryPoly other) {
    long tmpx = this.mNativeHandle; // last use of "this"
    long tmpy = other.mNativeHandle; // last use of other
    BinaryPoly result = new BinaryPoly();
    // GC happens here. "this" and "other" can be reclaimed and finalized.
    // tmpx and tmpy are still needed. But finalizer can delete tmpx and tmpy here!
    result.mNativeHandle = nativeMultiply(tmpx, tmpy);
    return result;
}
This is scary, and I am actually relieved this doesn't happen on android, because what I understand is that this and other get garbage collected before they go out of scope! This is even weirder considering that this is the object the method is called on, and that other is the argument of the method, so they both should already "be alive" in the scope where the method is being called.
A quick workaround to this would be to call some dummy methods on both this and other (ugly!), or passing them to the native method (where we can then retrieve the mNativeHandle and operate on it). And wait... this is already by default one of the arguments of the native method!
JNIEXPORT void JNICALL Java_package_BinaryPoly_multiply
(JNIEnv* env, jobject thiz, jlong xPtr, jlong yPtr) {}
How can this be possibly garbage collected?
Finalizers can be deferred for too long
“For this to work correctly, if you run an application that allocates lots of native memory and relatively little java memory it may actually not be the case that the garbage collector runs promptly enough to actually invoke finalizers [...] so you actually may have to invoke System.gc() and System.runFinalization() occasionally, which is tricky to do [...]”
If the native peer is only seen by a single java object which it is tied to, isn’t this fact transparent to the rest of the system, and thus the GC should just have to manage the lifecycle of the Java object as if it were a pure java one? There's clearly something I fail to see here.
Finalizers can actually extend the lifetime of the java object
[...] Sometimes finalizers actually extend the lifetime of the java object for another garbage collection cycle, which means for generational garbage collectors they may actually cause it to survive into the old generation and the lifetime may be greatly extended as a result of just having a finalizer.
I admit I don't really get what the issue is here and how it relates to having a native peer. I will do some research and possibly update the question :)
In conclusion
For now, I still believe that using a sort of RAII approach where the native peer is created in the java object's constructor and deleted in the finalize method is not actually dangerous, provided that:
the native peer doesn't hold any critical resource (in that case there should be a separate method to release the resource, the native peer must only act as the java object "counterpart" in the native realm)
the native peer doesn't spawn threads or do weird concurrent stuff in its destructor (who would want to do that?!?)
the native peer pointer is never shared outside the java object, only belongs to a single instance, and only accessed inside the java object's methods. On Android, a java object may access the native peer of another instance of the same class, right before calling a jni method accepting different native peers or, better, just passing the java objects to the native method itself
the java object's finalizer only deletes its own native peer, and does nothing else
Is there any other restriction that should be added, or is there really no way to ensure that a finalizer is safe even with all restrictions being respected?
finalize and other approaches that use GC knowledge of objects lifetime have a couple of nuances:
visibility: how do you guarantee that all the writes made by methods of object o are visible to the finalizer (i.e., that there is a happens-before relationship between the last action on object o and the code performing finalization)?
reachability: how do you guarantee, that an object o isn't destroyed prematurely (e.g., whilst one of its methods is running), which is allowed by the JLS? It does happen and cause crashes.
ordering: can you enforce a certain order in which objects are finalized?
termination: do you need to destroy all the objects when your app terminates?
throughput: GC-based approaches offer significantly smaller deallocation throughput than the deterministic approach.
It is possible to solve all of these issues with finalizers, but it requires a decent amount of code. Hans-J. Boehm has a great presentation which shows these issues and possible solutions.
To guarantee visibility, you have to synchronize your code, i.e., put operations with Release semantics in your regular methods, and an operation with Acquire semantics in your finalizer. For example:
A store in a volatile at the end of each method + read of the same volatile in a finalizer.
Release lock on the object at the end of each method + acquire the lock at the beginning of a finalizer (see keepAlive implementation in Boehm's slides).
To guarantee reachability (when it's not already guaranteed by the language specification), you may use:
Synchronization approaches described above also ensure reachability.
Pass references to the objects that must remain reachable (= non-finalizable) as arguments to native methods. In the talk you reference, nativeMultiply is static, therefore this may be garbage-collected.
Reference#reachabilityFence from Java 9+.
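As a rough sketch of the last option, here is the multiply() case with Reference.reachabilityFence. To keep it runnable without JNI, the "native heap" is faked with a map and fakeNativeMultiply is a plain static Java method; Poly, FAKE_HEAP and fakeNativeMultiply are all made-up names, and the GC-triggered cleanup that would delete a handle is left out. What matters is the try/finally shape, which follows the usage shown in the reachabilityFence javadoc.

```java
import java.lang.ref.Reference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for BinaryPoly; a map plays the role of native memory. */
final class Poly {
    private static final Map<Long, Long> FAKE_HEAP = new ConcurrentHashMap<>();
    private static final AtomicLong NEXT_HANDLE = new AtomicLong(1);

    private final long handle;

    Poly(long value) {
        handle = NEXT_HANDLE.getAndIncrement();
        FAKE_HEAP.put(handle, value);
    }

    /** Static, like nativeMultiply: it sees only the raw handles, not the Java objects. */
    private static long fakeNativeMultiply(long x, long y) {
        return FAKE_HEAP.get(x) * FAKE_HEAP.get(y);
    }

    Poly multiply(Poly other) {
        try {
            return new Poly(fakeNativeMultiply(handle, other.handle));
        } finally {
            // Keep both wrappers reachable until the call that uses their handles is done,
            // so no GC-triggered cleanup can free the handles in the middle of it.
            Reference.reachabilityFence(this);
            Reference.reachabilityFence(other);
        }
    }

    long value() {
        return FAKE_HEAP.get(handle);
    }
}
```

For example, new Poly(6).multiply(new Poly(7)).value() yields 42, and both operands are guaranteed to stay reachable for the duration of the multiplication.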
The difference between plain finalize and PhantomReferences is that the latter gives you way more control over the various aspects of finalization:
Can have multiple queues receiving phantom refs and pick a thread performing finalization for each of them.
Can finalize in the same thread that did allocation (e.g., thread local ReferenceQueues).
Easier to enforce ordering: keep a strong reference to an object B that must remain alive when A is finalized as a field of PhantomReference to A;
Easier to implement safe termination, as you must keep PhantomReferences strongly reachable until they are enqueued by GC.
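A small sketch of the ordering and strong-reachability points, under the assumption that "B" is just some object A's cleanup needs (a StringBuilder here, purely for illustration). OrderedCleanup, DependentRef, PENDING and processQueue are hypothetical names. The calling thread decides when to process the queue, which is also how cleanup in the allocating thread would work.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class OrderedCleanup {
    static final ReferenceQueue<Object> QUEUE = new ReferenceQueue<>();
    // Keeps the references strongly reachable until they have been processed.
    static final Set<Reference<?>> PENDING = ConcurrentHashMap.newKeySet();

    /** Phantom reference to "A" that pins "B" until A's cleanup has run. */
    static final class DependentRef extends PhantomReference<Object> {
        private final String name;
        private StringBuilder dependency; // "B": strongly reachable as long as this reference is

        DependentRef(Object a, String name, StringBuilder dependency) {
            super(a, QUEUE);
            this.name = name;
            this.dependency = dependency;
            PENDING.add(this);
        }

        void cleanUp() {
            dependency.append("released ").append(name); // B is still intact here
            dependency = null;                             // only now may B become unreachable
            PENDING.remove(this);
        }
    }

    /** Runs cleanup for everything currently enqueued, in the calling thread. */
    static int processQueue() {
        int processed = 0;
        for (Reference<?> r; (r = QUEUE.poll()) != null; processed++) {
            ((DependentRef) r).cleanUp();
        }
        return processed;
    }
}
```

Because B is held by a reference that is itself held in PENDING, any phantom reference to B cannot be enqueued before A's cleanup has dropped it.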
My own take is that you should release native objects as soon as you are done with them, in a deterministic fashion. As such, using scope to manage them is preferable to relying on the finalizer. You can use the finalizer to clean up as a last resort, but I would not use it solely to manage the actual lifetime, for the reasons you pointed out in your own question.
As such, let the finalizer be the final attempt, but not the first.
I think most of this debate stems from the legacy status of finalize(). It was introduced in Java to address things that garbage collection didn't cover, but not necessarily things like system resources (files, network connections, etc.), so it always felt kind of half baked. I don't necessarily agree with using something like PhantomReference, which professes to be a better finalizer than finalize(), when the pattern itself is problematic.
Hugues Moreau pointed out that finalize() will be deprecated in Java 9. The preferred pattern of the Java team appears to be treating things like native peers as a system resource and cleaning them up via try-with-resources. Implementing AutoCloseable allows you to do this. Note that try-with-resources and AutoCloseable post-date both Josh Bloch's direct involvement with Java and Effective Java 2nd edition.
see https://github.com/android/platform_frameworks_base/blob/master/graphics/java/android/graphics/Bitmap.java#L135
Use PhantomReference instead of a finalizer.
How can this be possibly garbage collected?
Because function nativeMultiply(long xCppPtr, long yCppPtr) is static. If a native function is static, its second parameter is jclass pointing to its class instead of jobject pointing to this. So in this case this is not one of the arguments.
If it had not been static there would be only issue with the other object.
Let me come up with a provocative proposal. If your C++ side of a managed Java object can be allocated in contiguous memory, then instead of the traditional long native pointer, you can use a DirectByteBuffer. This may really be a game changer: now GC can be smart enough about these small Java wrappers around huge native data structures (e.g. decide to collect it earlier).
Unfortunately, most real life C++ objects don't fall into this category...

How does finalize() work in java?

So, I recently discovered the finalize method in Java (not sure why I missed it before, but there it is). This seems like it could be the answer to a lot of the issues I'm working with, but I wanted to get a bit more information first.
Online, I found this diagram illustrating the process of garbage collection and finalize:
A couple of questions:
This takes place in a separate thread, correct?
What happens if I instantiate a new object during finalize? Is that allowed?
What happens if I call on a static method from finalize?
What happens if I establish a new reference to the object from within finalize?
I suppose I should explain why I'm interested. I work with LWJGL a lot, and it seems that if I could use finalize to cause Java objects to automatically clean up OpenGL resources, then I could do some really nice things in terms of an API.
finalize() is called by the Java Garbage Collector when it detects that no references to that particular object exists. finalize() is inherited by all Java objects through the Object class.
As far as I am aware you would have no difficulty making static method calls from a finalize() method, and you could establish a new reference to the object from finalize() - however I would say this is poor programming practice.
You shouldn't rely on finalize() for clearing up, and it is better to clear up as you go. I prefer to use try, catch, finally for clearing up, rather than using finalize(). In particular, by using finalize() you will cause the JVM to hold onto all other objects that your finalizable object references, just in case it makes calls to them. This means you're holding onto memory you might not need to use. More importantly, this also means you can cause the JVM to never end up disposing of objects, because it has to hold onto them in case another object's finalize method needs them, e.g. a race condition.
Also, consider that it is entirely possible that GC won't be called. Therefore you can't actually guarantee that finalize() will ever be called.
Clear up resources as and when you are finished with them, and do not rely on finalize() to do it is my advice.
I don't think there are any guarantees about what thread will be used. New objects may be instantiated and static methods may be called. Establishing a new reference to your object will prevent it from being garbage collected, but the finalize method will not be called again--you don't want to do this.
Cleaning up resources is precisely what the finalize method is for, so you should be good there. A couple of warnings, though:
The method is not guaranteed to be called. If you have tied up resources that will not automatically be freed when your program stops, do not depend on finalize.
When the method is called is not guaranteed. With memory tight, this will be sooner. With lots of free memory, it will be later if at all. This may suit you fine: with lots of memory you may not be all that concerned about freeing up the resources. (Though hanging on to them may interfere with other software running at the same time, in which case you would be concerned.)
My usual solution is to have some kind of dispose method that does the cleanup. I call it explicitly at some point if I can, and as soon as I can. Then I add a finalize method that just calls the dispose method. (Note that the dispose method must behave well when called more than once! Indeed, with this kind of programming I might call dispose several times outside finalize, not being sure if previous calls were made successfully and yet wanting it to be called effectively as soon as possible.) Now, ideally, my resources are freed as soon as I no longer need them. However, if I lose track of the object with the resources, the finalize method will bail me out when memory runs short and I need the help.
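The dispose-plus-finalize pattern described above could look roughly like the sketch below. The class name NativeHandle and its release counter are made up for illustration and stand in for whatever real resource is being freed. Note that finalize() is deprecated in recent Java versions, hence the suppressed warnings.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical resource wrapper; the counter stands in for a real release call.
final class NativeHandle {
    private final AtomicBoolean disposed = new AtomicBoolean(false);
    private final AtomicInteger releases = new AtomicInteger();

    // Safe to call any number of times, from any thread: only the first call releases.
    void dispose() {
        if (disposed.compareAndSet(false, true)) {
            releases.incrementAndGet(); // the real cleanup would go here
        }
    }

    boolean isDisposed() {
        return disposed.get();
    }

    int releaseCount() {
        return releases.get();
    }

    // Backup only: may run late or never, so callers should still call dispose() themselves.
    @Override
    @SuppressWarnings({"deprecation", "removal"})
    protected void finalize() throws Throwable {
        try {
            dispose();
        } finally {
            super.finalize();
        }
    }
}
```

Because dispose() flips a flag atomically, calling it several times explicitly and then once more from finalize() still releases the resource only once.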
First of all, remember there is no guarantee that finalization will even be run at all for all your objects. You can use it to free memory allocated in native code associated with an object, but for pure Java code most use cases are only to perform a "backup" mechanism of cleaning up resources. This means in most cases you should free resources manually, and finalizers could act only as a sort of helper to clean up if you forget to do it the standard way. However, you can't use them as the only or the main mechanism of cleanup. Even more generally, you shouldn't write any code whose correctness depends on finalizers being run.
Ad 1. As far as I know, there are no guarantees about what thread calls finalize(), though in practice this will probably be one of the GC threads.
Ad 2. Instantiating new objects is allowed. However, there are a number of pitfalls with handling object references in finalizers. In particular, if you store a hard reference to the object being finalized in some live object, you can prevent your about-to-be-garbage-collected object from being cleaned up. This kind of object resurrection may lead to exhausting your resources if it gets out of control. Also, watch out for exceptions in finalize() - they may halt the finalization, but there's no automatic way for your program to learn about them. You need to wrap the code in try-catch blocks and propagate the information yourself. Also, long execution time of finalizers may cause the queue of objects to build up and consume lots of memory. Some other noteworthy problems and limitations are described in this JavaWorld article.
Ad 3. There shouldn't be any issues with calling static methods from finalizers.
Ad 4. As mentioned in point 2, it is possible to prevent an object from being garbage collected (to resurrect it) by placing a reference to it in another live object during finalization. However, this is tricky behavior and probably not good practice.
To sum up, you can't rely on finalizers for cleaning up your resources. You need to handle that manually, and finalizers in your case may at best be used as a backup mechanism to cover up after sloppy coding to some degree. This means, unfortunately, your idea of making the API nicer by using finalizers to clean up OpenGL resources probably won't work.

Does the JVM create a mutex for every object in order to implement the 'synchronized' keyword? If not, how?

As a C++ programmer becoming more familiar with Java, it's a little odd to me to see language level support for locking on arbitrary objects without any kind of declaration that the object supports such locking. Creating mutexes for every object seems like a heavy cost to be automatically opted into. Besides memory usage, mutexes are an OS limited resource on some platforms. You could spin lock if mutexes aren't available but the performance characteristics of that are significantly different, which I would expect to hurt predictability.
Is the JVM smart enough in all cases to recognize that a particular object will never be the target of the synchronized keyword and thus avoid creating the mutex? The mutexes could be created lazily, but that poses a bootstrapping problem that itself necessitates a mutex, and even if that were worked around I assume there's still going to be some overhead for tracking whether a mutex has already been created or not. So I assume if such an optimization is possible, it must be done at compile time or startup. In C++ such an optimization would not be possible due to the compilation model (you couldn't know if the lock for an object was going to be used across library boundaries), but I don't know enough about Java's compilation and linking to know if the same limitations apply.
Speaking as someone who has looked at the way that some JVMs implement locks ...
The normal approach is to start out with a couple of reserved bits in the object's header word. If the object is never locked, or if it is locked but there is no contention it stays that way. If and when contention occurs on a locked object, the JVM inflates the lock into a full-blown mutex data structure, and it stays that way for the lifetime of the object.
EDIT - I just noticed that the OP was talking about OS-supported mutexes. In the examples that I've looked at, the uninflated mutexes were implemented directly using CAS instructions and the like, rather than using pthread library functions, etc.
This is really an implementation detail of the JVM, and different JVMs may implement it differently. However, it is definitely not something that can be optimized at compile time, since Java links at runtime, and thus it is possible for previously unknown code to get hold of an object created in older code and start synchronizing on it.
Note that in Java lingo, the synchronization primitive is called "monitor" rather than mutex, and it is supported by special bytecode operations. There's a rather detailed explanation here.
You can never be sure that an object will never be used as a lock (consider reflection). Typically every object has a header with some bits dedicated to the lock. It is possible to implement it such that the header is only added as needed, but that gets a bit complicated and you probably need some header anyway (for the class, which is the equivalent of the "vtbl" and allocation size in C++, plus the hash code and garbage collection data).
Here's a wiki page on the implementation of synchronisation in the OpenJDK.
(In my opinion, adding a lock to every object was a mistake.)
Can't the JVM use a compare-and-swap instruction directly? Let's say each object has a field lockingThreadId storing the id of the thread that is locking it:
while( compare_and_swap (obj.lockingThreadId, null, thisThreadId) != thisThreadId )
    // failed, someone else got it
    mark this thread as waiting on obj.
    shelve this thread
// out of loop. now this thread locked the object
do the work
obj.lockingThreadId = null;
wake up threads waiting on the obj
This is a toy model, but it doesn't seem too expensive, and it does not rely on the OS.
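For what it's worth, the toy model above can be written as runnable Java. This is only a sketch under assumptions, not how any particular JVM implements monitors: the class ToyCasLock and its method names are invented here, the lock is not reentrant, and it ignores interrupts. It uses AtomicReference.compareAndSet for the compare-and-swap and LockSupport.park/unpark to shelve and wake threads.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.locks.LockSupport;

// Toy, non-reentrant lock: an owner field updated by CAS plus a queue of parked waiters.
final class ToyCasLock {
    private final AtomicReference<Thread> owner = new AtomicReference<>();
    private final Queue<Thread> waiters = new ConcurrentLinkedQueue<>();

    void lock() {
        Thread me = Thread.currentThread();
        if (owner.compareAndSet(null, me)) {
            return; // uncontended fast path: a single CAS
        }
        waiters.add(me); // mark this thread as waiting
        // Re-check after enqueuing so an unlock() in between cannot be missed.
        while (!owner.compareAndSet(null, me)) {
            LockSupport.park(this); // shelve this thread; may also wake spuriously
        }
        waiters.remove(me);
    }

    void unlock() {
        if (!owner.compareAndSet(Thread.currentThread(), null)) {
            throw new IllegalMonitorStateException("not the owner");
        }
        Thread next = waiters.peek();
        if (next != null) {
            LockSupport.unpark(next); // wake one waiter; it retries the CAS
        }
    }

    // Demo helper: several threads increment a plain int under the lock.
    static int contendedCount(int threads, int incrementsPerThread) {
        ToyCasLock lock = new ToyCasLock();
        int[] counter = {0};
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    lock.lock();
                    try {
                        counter[0]++;
                    } finally {
                        lock.unlock();
                    }
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter[0];
    }
}
```

The per-object cost here is one reference field plus a queue, which is the kind of state a JVM would rather keep out of the object until contention actually happens.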

Why do you not explicitly call finalize() or start the garbage collector?

After reading this question, I was reminded of when I was taught Java and told never to call finalize() or run the garbage collector because "it's a big black box that you never need to worry about". Can someone boil the reasoning for this down to a few sentences? I'm sure I could read a technical report from Sun on this matter, but I think a nice, short, simple answer would satisfy my curiosity.
The short answer: Java garbage collection is a very finely tuned tool. System.gc() is a sledge-hammer.
Java's heap is divided into different generations, each of which is collected using a different strategy. If you attach a profiler to a healthy app, you'll see that it very rarely has to run the most expensive kinds of collections because most objects are caught by the faster copying collector in the young generation.
Calling System.gc() directly, while technically not guaranteed to do anything, in practice will trigger an expensive, stop-the-world full heap collection. This is almost always the wrong thing to do. You think you're saving resources, but you're actually wasting them for no good reason, forcing Java to recheck all your live objects “just in case”.
If you are having problems with GC pauses during critical moments, you're better off configuring the JVM to use the concurrent mark/sweep collector, which was designed specifically to minimise time spent paused, than trying to take a sledgehammer to the problem and just breaking it further.
The Sun document you were thinking of is here: Java SE 6 HotSpot™ Virtual Machine Garbage Collection Tuning
(Another thing you might not know: implementing a finalize() method on your object makes garbage collection slower. Firstly, it will take two GC runs to collect the object: one to run finalize() and the next to ensure that the object wasn't resurrected during finalization. Secondly, objects with finalize() methods have to be treated as special cases by the GC because they have to be collected individually, they can't just be thrown away in bulk.)
Don't bother with finalizers.
Switch to incremental garbage collection.
If you want to help the garbage collector, null out references to objects you no longer need. The fewer paths there are to follow, the more obviously an object is garbage.
Don't forget that (non-static) inner class instances keep references to their parent class instance. So an inner class thread keeps a lot more baggage than you might expect.
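A small sketch of the inner-class point, with made-up names (BigOwner, InnerTask, NestedTask). The inner class reads a field of its enclosing instance, so the compiler gives it a hidden field of the outer type. The static nested class has no such field. The helper simply looks for a declared field of the outer type via reflection.

```java
import java.lang.reflect.Field;

// Hypothetical owner object that drags around a large buffer.
final class BigOwner {
    final byte[] baggage = new byte[1024 * 1024];

    // Non-static inner class: uses the outer instance, so it keeps a reference to it.
    final class InnerTask implements Runnable {
        @Override
        public void run() {
            System.out.println("outer buffer size: " + baggage.length);
        }
    }

    // Static nested class: copies only the data it needs, no link to BigOwner.
    static final class NestedTask implements Runnable {
        private final int size;

        NestedTask(int size) {
            this.size = size;
        }

        @Override
        public void run() {
            System.out.println("copied size: " + size);
        }
    }

    // True if the class declares a field whose type is BigOwner (the hidden outer reference).
    static boolean holdsOuterReference(Class<?> type) {
        for (Field field : type.getDeclaredFields()) {
            if (field.getType() == BigOwner.class) {
                return true;
            }
        }
        return false;
    }
}
```

As long as an InnerTask is reachable (for example from a running thread), its BigOwner and the 1 MB buffer stay reachable too. A NestedTask does not pin the owner.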
In a very related vein, if you're using serialization, and you've serialized temporary objects, you're going to need to clear the serialization caches, by calling ObjectOutputStream.reset() or your process will leak memory and eventually die.
Downside is that non-transient objects are going to get re-serialized.
Serializing temporary result objects can be a bit more messy than you might think!
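To make the serialization cache visible, here is a sketch (the class ResetDemo and its method are invented for this example). It writes the same list twice to one ObjectOutputStream, mutating it in between. Without reset() the stream remembers the object and writes only a back-reference the second time. With reset() it forgets and re-serializes the whole thing.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

final class ResetDemo {
    // Returns the sizes of the two lists as seen by the reader.
    static int[] roundTrip(boolean resetBetweenWrites) {
        ArrayList<String> payload = new ArrayList<>();
        payload.add("first");
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(payload);
            payload.add("second");
            if (resetBetweenWrites) {
                out.reset(); // drop the stream's table of already-written objects
            }
            out.writeObject(payload);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            List<?> firstRead = (List<?>) in.readObject();
            List<?> secondRead = (List<?>) in.readObject();
            return new int[] {firstRead.size(), secondRead.size()};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Without reset() the reader sees sizes 1 and 1, because the second read is just the first object again. With reset() it sees 1 and 2, at the cost of writing the list in full a second time. The table that makes the back-reference possible is also what keeps every written object reachable until reset() or close().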
Consider using soft references. If you don't know what soft references are, have a read of the javadoc for java.lang.ref.SoftReference
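If it helps, a soft-reference cache can be sketched like this. SoftCache and its loader parameter are names made up for the example. The only library pieces are java.lang.ref.SoftReference and ConcurrentHashMap. The GC may clear a soft reference when memory gets tight, so every lookup has to cope with get() returning null.

```java
import java.lang.ref.SoftReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical cache whose values the GC is allowed to reclaim under memory pressure.
final class SoftCache<K, V> {
    private final Map<K, SoftReference<V>> entries = new ConcurrentHashMap<>();
    private final Function<K, V> loader;

    SoftCache(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        SoftReference<V> ref = entries.get(key);
        V value = (ref == null) ? null : ref.get();
        if (value == null) {
            // Either never cached, or the GC cleared the reference: load again.
            value = loader.apply(key);
            entries.put(key, new SoftReference<>(value));
        }
        return value;
    }
}
```

This simple version never removes map entries whose referents were cleared and may load the same key twice under a race. A fuller version would handle both.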
Steer clear of Phantom references and Weak references unless you really get excitable.
Finally, if you really can't tolerate the GC use Realtime Java.
No, I'm not joking.
The reference implementation is free to download, and Peter Dibble's book from Sun is really good reading.
As far as finalizers go:
They are virtually useless. They aren't guaranteed to be called in a timely fashion, or indeed, at all (if the GC never runs, neither will any finalizers). This means you generally shouldn't rely on them.
Finalizers are not guaranteed to be idempotent. The garbage collector takes great care to guarantee that it will never call finalize() more than once on the same object. With well-written objects, it won't matter, but with poorly written objects, calling finalize multiple times can cause problems (e.g. double release of a native resource ... crash).
Every object that has a finalize() method should also provide a close() (or similar) method. This is the function you should be calling. e.g., FileInputStream.close(). There's no reason to be calling finalize() when you have a more appropriate method that is intended to be called by you.
Assuming finalizers are similar to their .NET namesake then you only really need to call these when you have resources such as file handles that can leak. Most of the time your objects don't have these references so they don't need to be called.
It's bad to try to collect the garbage because it's not really your garbage. You have told the VM to allocate some memory when you created objects, and the garbage collector is hiding information about those objects. Internally the GC is performing optimisations on the memory allocations it makes. When you manually try to collect the garbage you have no knowledge about what the GC wants to hold onto and get rid of; you are just forcing its hand. As a result you mess up internal calculations.
If you knew more about what the GC was holding internally then you might be able to make more informed decisions, but then you've missed the benefits of GC.
The real problem with closing OS handles in finalize is that finalizers are executed in no guaranteed order. And if you have handles to things that block (think e.g. sockets), your code can potentially get into a deadlock situation (not trivial at all).
So I'm for explicitly closing handles in a predictable orderly manner. Basically code for dealing with resources should follow the pattern:
SomeStream s = null;
...
try {
    s = openStream();
    ....
    s.io();
    ...
} finally {
    if (s != null) {
        s.close();
        s = null;
    }
}
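On Java 7 and later the same pattern can be written with try-with-resources. The sketch below uses a made-up TrackedHandle class that just logs what happens to it. It shows that resources are closed in reverse order of opening, and that they are closed before the catch block runs, even when the body throws.

```java
import java.util.ArrayList;
import java.util.List;

final class TryWithResourcesDemo {
    // Hypothetical resource that records open/close events in a shared log.
    static final class TrackedHandle implements AutoCloseable {
        private final String name;
        private final List<String> log;

        TrackedHandle(String name, List<String> log) {
            this.name = name;
            this.log = log;
            log.add("open " + name);
        }

        void io(boolean fail) {
            if (fail) {
                throw new IllegalStateException("io failed on " + name);
            }
        }

        @Override
        public void close() {
            log.add("close " + name);
        }
    }

    static List<String> run(boolean failDuringIo) {
        List<String> log = new ArrayList<>();
        try (TrackedHandle socket = new TrackedHandle("socket", log);
             TrackedHandle stream = new TrackedHandle("stream", log)) {
            stream.io(failDuringIo);
            log.add("io done");
        } catch (IllegalStateException e) {
            log.add("caught"); // both handles are already closed at this point
        }
        return log;
    }
}
```

Calling run(false) logs open socket, open stream, io done, close stream, close socket. Calling run(true) logs open socket, open stream, close stream, close socket, caught.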
It gets even more complicated if you write your own classes that work via JNI and open handles. You need to make sure handles are closed (released) and that it happens only once. A frequently overlooked OS handle in desktop J2SE is Graphics[2D]. Even BufferedImage.getGraphics() can potentially return a handle that points into a video driver (actually holding the resource on the GPU). If you don't release it yourself and leave it to the garbage collector, you may hit strange OutOfMemory and similar situations where you have run out of video card mapped bitmaps but still have plenty of memory. In my experience it happens rather frequently in tight loops working with graphics objects (extracting thumbnails, scaling, sharpening, you name it).
Basically, the GC does not take over the programmer's responsibility for correct resource management. It only takes care of memory and nothing else. IMHO, Stream.finalize calling close() would be better implemented by throwing new RuntimeException("garbage collecting the stream that is still open"). It would save hours and days of debugging and cleaning up code after sloppy amateurs have left loose ends.
Happy coding.
Peace.
The GC does a lot of optimization on when to properly finalize things.
So unless you're familiar with how the GC actually works and how it tags generations, manually calling finalize or starting a GC will more likely hurt performance than help.
Avoid finalizers. There is no guarantee that they will be called in a timely fashion. It could take quite a long time before the Memory Management system (i.e., the garbage collector) decides to collect an object with a finalizer.
Many people use finalizers to do things like close socket connections or delete temporary files. By doing so you make your application behaviour unpredictable and tied to when the JVM is going to GC your object. This can lead to "out of memory" scenarios, not due to the Java Heap being exhausted, but rather due to the system running out of handles for a particular resource.
One other thing to keep in mind is that introducing calls to System.gc() or similar hammers may show good results in your environment, but they won't necessarily translate to other systems. Not everyone runs the same JVM; there are many: Sun, IBM J9, BEA JRockit, Harmony, OpenJDK, etc. These JVMs all conform to the JCK (those that have been officially tested, that is), but have a lot of freedom when it comes to making things fast. GC is one of those areas that everyone invests in heavily. Using a hammer will often destroy that effort.
