I was browsing some old books and found a copy of "Practical Java" by Peter Haggar. In the performance section, there is a recommendation to set object references to null when they are no longer needed.
In Java, does setting object references to null improve performance or garbage collection efficiency? If so, in what cases is this an issue? Container classes? Object composition? Anonymous inner classes?
I see this in code pretty often. Is this now obsolete programming advice or is it still useful?
It depends a bit on when you were thinking of nulling the reference.
If you have an object chain A->B->C, then once A is not reachable, A, B and C will all be eligible for garbage collection (assuming nothing else is referring to either B or C). There's no need, and never has been any need, to explicitly set references A->B or B->C to null, for example.
Apart from that, most of the time the issue doesn't really arise, because in reality you're dealing with objects in collections. You should generally be thinking of removing objects from lists, maps, etc. by calling the appropriate remove() method.
The case where there used to be some advice to set references to null was specifically in a long scope where a memory-intensive object ceased to be used partway through the scope. For example:
{
BigObject obj = ...
doSomethingWith(obj);
obj = null; // <-- explicitly set to null
doSomethingElse();
}
The rationale here was that because obj is still in scope, then without the explicit nulling of the reference, it does not become garbage collectable until after the doSomethingElse() method completes. And this is the advice that probably no longer holds on modern JVMs: it turns out that the JIT compiler can work out at what point a given local object reference is no longer used.
No, it's not obsolete advice. Dangling references are still a problem, especially if you're, say, implementing an expandable array container (ArrayList or the like) using a pre-allocated array. Elements beyond the "logical" size of the list should be nulled out, or else they won't be freed.
See Effective Java 2nd ed, Item 6: Eliminate Obsolete Object References.
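To make the expandable-array case concrete, here is a minimal sketch. SimpleList and its methods are hypothetical names made up for illustration, not the JDK's ArrayList; the point is only the line that clears the slot beyond the logical size.

```java
import java.util.Arrays;

// Hypothetical minimal growable list (illustrative only, not java.util.ArrayList).
final class SimpleList<E> {
    private Object[] elements = new Object[4]; // pre-allocated backing array
    private int size = 0;                      // "logical" size of the list

    void add(E e) {
        if (size == elements.length) {
            elements = Arrays.copyOf(elements, size * 2); // grow when full
        }
        elements[size++] = e;
    }

    @SuppressWarnings("unchecked")
    E removeLast() {
        if (size == 0) {
            throw new IllegalStateException("list is empty");
        }
        E result = (E) elements[--size];
        elements[size] = null; // without this, the array keeps the element reachable
        return result;
    }

    int size() {
        return size;
    }

    // Exposed only so the sketch can show that the slot really was cleared.
    boolean slotIsCleared(int index) {
        return elements[index] == null;
    }

    public static void main(String[] args) {
        SimpleList<String> list = new SimpleList<>();
        list.add("a");
        list.add("b");
        list.removeLast();
        System.out.println(list.slotIsCleared(1)); // prints true
    }
}
```

Without the nulling line, the removed element would stay strongly reachable through the backing array for as long as the list itself lives.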
Instance fields, array elements
If there is a reference to an object, it cannot be garbage collected. That is an unfortunate situation when the object (and the whole graph behind it) is big, only one reference is stopping garbage collection, and that reference is not really needed anymore.
Pathological cases are the object that retains an unnecessary reference to the whole XML DOM tree that was used to configure it, the MBean that was not unregistered, or the single reference to an object from an undeployed web application that prevents a whole classloader from being unloaded.
So unless you are sure that the object that holds the reference itself will be garbage collected anyway (or even then), you should null out everything that you no longer need.
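A rough sketch of the "configuration kept around forever" case. ConfiguredService is a made-up class, and a plain Map stands in for the XML DOM tree, so treat the names as assumptions rather than anyone's real API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical long-lived service; the Map stands in for a large config tree.
final class ConfiguredService {
    private Map<String, String> rawConfig; // only needed while initializing
    private int port;

    ConfiguredService(Map<String, String> rawConfig) {
        this.rawConfig = rawConfig;
    }

    void init() {
        port = Integer.parseInt(rawConfig.getOrDefault("port", "8080"));
        // The service outlives init(), so the field would otherwise pin
        // the whole configuration graph for the lifetime of the service.
        rawConfig = null;
    }

    int port() {
        return port;
    }

    boolean stillHoldsConfig() {
        return rawConfig != null;
    }

    public static void main(String[] args) {
        Map<String, String> config = new HashMap<>();
        config.put("port", "9090");
        ConfiguredService service = new ConfiguredService(config);
        service.init();
        System.out.println(service.port() + " " + service.stillHoldsConfig());
    }
}
```

An arguably cleaner design is to pass the configuration to init() as a parameter so it never becomes a field at all.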
Scoped variables:
If you are considering setting a local variable to null before the end of its scope, so that it can be reclaimed by the garbage collector and to mark it as "unusable from now on", you should consider putting it in a more limited scope instead.
{
BigObject obj = ...
doSomethingWith(obj);
obj = null; // <-- explicitly set to null
doSomethingElse();
}
becomes
{
{
BigObject obj = ...
doSomethingWith(obj);
} // <-- obj goes out of scope
doSomethingElse();
}
Long, flat scopes are generally bad for the legibility of the code, too. Introducing private methods to break things up just for that purpose is not unheard of.
In memory-restricted environments (e.g. cellphones) this can be useful. By setting the reference to null, the object doesn't need to wait for the variable to go out of scope to be gc'd.
For everyday programming, however, this shouldn't be the rule, except in special cases like the one Chris Jester-Young cited.
First, you never set an object to null; you can only set a reference variable to null. I'll explain below:
List list1 = new ArrayList();
List list2 = list1;
In the code segment above, we create an ArrayList object in memory and a reference variable named list1 that refers to it; list1 is nothing more than a variable. In the second line we copy the reference held in list1 into list2, so both variables now refer to the same object. So, going back to your question, if I do:
list1 = null;
that means list1 no longer refers to any object. list2 still refers to the ArrayList, however, so the object stays reachable. If you check the size of list2:
list2.size(); // returns 0: the list is empty, but it is still alive
Only once list2 is also cleared or goes out of scope does the garbage collector come in, which says: «You have nothing to worry about when it comes to freeing the memory held by the object. I will do that when I find that it is no longer used in the program, and the JVM will manage me.»
I hope that clears up the concept.
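Here is a small runnable sketch of what nulling one of two references to the same list does. AliasDemo is just an illustrative name; the behaviour shown follows from Java's reference semantics and not from anything specific to ArrayList.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: two variables, one object.
final class AliasDemo {
    static List<String> aliasSurvivesNulling() {
        List<String> list1 = new ArrayList<>();
        List<String> list2 = list1; // second reference to the same object
        list1.add("x");
        list1 = null;               // clears only the variable list1
        return list2;               // still refers to the original list
    }

    public static void main(String[] args) {
        System.out.println(aliasSurvivesNulling().size()); // prints 1
    }
}
```

The list stays reachable (and therefore cannot be collected) for as long as any one of the references to it is still live.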
One of the reasons to do so is to eliminate obsolete object references.
You can read the text here.
You have one big object in Java. It has four or five references to it, and you don't know all of them. At deletion time you know only one reference, and you want to delete the object completely. How can you achieve that? And if you want to find the other references, what is the best way to do that?
It is not in our hands. You can only null out the reference on your end.
Object a = new Object();
a = null; // after this, if no live thread can still reach the object, it becomes eligible for garbage collection
You could try finalize() or System.runFinalization() but frankly, if there are references still pointing to the object, then I think the GC will ignore your request.
It is not possible in Java.
If a strong reference still refers to your object, you cannot force the JVM to GC that object. If it did, it could not guarantee that the program would still work.
If the code holding all the other references is under your control, consider changing it to use WeakReference or SoftReference.
There are some things that are not in our hands, and it's better to leave them to the JRE to handle. All we can do is make sure that we explicitly set references to null after using them.
{
// Some block
HugeObject obj = HugeObject.getInstance();
// Use it
obj = null;
}
// end of block
Java memory handling is just built to prevent that. An object is guaranteed to live as long as a reference to this object exists. As far as I know there is no (official) way to get to know the other references to an object (and there should be no need for that).
In Java, the GC (garbage collector) handles heap cleanup. If an object has no live references to it, then it will automatically be cleaned up. So you need to make sure there are no live references to the object.
Setting the reference to null is one way. But it does not guarantee cleanup if some other object still refers to the same object. That is why writing good code involves releasing all resources after use, which includes setting references to null.
If you are running low on heap, you can try increasing the heap size or calling System.gc(), but calling gc manually does not guarantee that a collection will actually be performed. It depends on a lot of parameters which are JVM dependent.
What kind of references are these to the object? Were these references created by you, and at runtime you don't keep track of them? If so, you can wrap your references to the object in soft/weak references and then explicitly request a GC run. Otherwise, at runtime, if any live thread has access to the object, the GC will not delete it.
It is hard to answer without knowing your use case, but if there is one location that you want to be able to remove it from, then you can store every other reference to it as a WeakReference. Java normally uses strong references when referencing objects, and the GC will only clear something when it has no more strong references. However, if you use WeakReferences and your strong reference ever goes out of scope, there is no guarantee that your data will remain, even if it is still needed.
I could be mistaken about this though, as I haven't used this class in a year or two.
On WeakReferences:
http://docs.oracle.com/javase/7/docs/api/java/lang/ref/WeakReference.html
You can wrap your object in a WeakReference and register it with a ReferenceQueue. That way, once the object is no longer strongly referenced, it is eligible for GC.
/**
The reference queue is optional. If you don't pass one, the reference is simply not registered with any queue.
**/
ReferenceQueue<? super Object> testReferenceQueue = new ReferenceQueue<Object>();
Map<String,String> demoHashMap = new HashMap<String,String>();
demoHashMap.put("SomeValue","testValue");
// Wrap the map in a weak reference that is registered with the queue
WeakReference<?> weakObject = new WeakReference<Object>(demoHashMap, testReferenceQueue);
demoHashMap.clear();
demoHashMap = null; // The map is no longer strongly referenced from anywhere
// weakObject itself is never null here; ask it for the referent instead
if (weakObject.get() != null) {
System.out.println("Object is not GCd yet");
} else {
System.out.println("It is already garbage collected");
}
Let's assume there is a Tree object with a root TreeNode object, and each TreeNode has leftNode and rightNode objects (e.g. a BinaryTree object).
If I call:
myTree = null;
what really happens with the related TreeNode objects inside the tree? Will they be garbage collected as well, or do I have to set all the related objects inside the tree object to null?
Garbage collection in Java is performed on the basis of "reachability". The JLS defines the term as follows:
"A reachable object is any object that can be accessed in any potential continuing computation from any live thread."
So long as an object is reachable [1], it is not eligible for garbage collection.
The JLS leaves it up to the Java implementation to figure out how to determine whether an object could be accessible. If the implementation cannot be sure, it is free to treat a theoretically unreachable object as reachable ... and not collect it. (Indeed, the JLS allows an implementation to not collect anything, ever! No practical implementation would do that though [2].)
In practice, (conservative) reachability is calculated by tracing; looking at what can be reached by following references starting with the class (static) variables, and local variables on thread stacks.
Here's what this means for your question:
If i call: myTree = null; what really happens with the related TreeNode objects inside the tree? Will be garbage collected as well, or i have to set null all the related objects inside the tree object??
Let's assume that myTree contains the last remaining reachable reference to the tree root.
1. Nothing happens immediately.
2. If the internal nodes were previously only reachable via the root node, then they are now unreachable, and eligible for garbage collection. (In this case, assigning null to references to internal nodes is unnecessary.)
3. However, if the internal nodes were reachable via other paths, they are presumably still reachable, and therefore NOT eligible for garbage collection. (In this case, assigning null to references to internal nodes is a mistake. You are dismantling a data structure that something else might later try to use.)
If myTree does not contain the last remaining reachable reference to the tree root, then nulling the internal reference is a mistake for the same reason as in 3. above.
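The cases above can be sketched in a few lines. Tree and TreeNode here are hypothetical classes modelled on the question, and the sketch only demonstrates the deterministic part (a node that is referenced from elsewhere stays usable); whether and when the unreachable nodes are actually reclaimed is up to the JVM.

```java
// Hypothetical Tree/TreeNode classes modelled on the question.
final class TreeDemo {
    static final class TreeNode {
        final String value;
        TreeNode left;
        TreeNode right;

        TreeNode(String value) {
            this.value = value;
        }
    }

    static final class Tree {
        TreeNode root;

        Tree(TreeNode root) {
            this.root = root;
        }
    }

    // Builds a three-node tree, drops every reference to the Tree and its
    // root, and returns a node that is also referenced from outside the tree.
    static TreeNode dropTreeButKeepLeft() {
        TreeNode left = new TreeNode("left");
        TreeNode root = new TreeNode("root");
        root.left = left;
        root.right = new TreeNode("right");
        Tree myTree = new Tree(root);

        root = null;
        myTree = null; // Tree, root and "right" are now unreachable as a group;
                       // no per-node nulling is needed for them to be eligible
        return left;   // "left" is still reachable through this reference
    }

    public static void main(String[] args) {
        System.out.println(dropTreeButKeepLeft().value); // prints left
    }
}
```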
So when should you null things to help the garbage collector?
The cases where you need to worry are when you can figure out that the reference in some cell (local, instance or class variable, or array element) won't be used again, but the compiler and runtime can't! The cases fall into roughly three categories:
Object references in class variables ... which (by definition) never go out of scope.
Object references in local variables that are still in scope ... but won't be used. For example:
public List<Pig> pigSquadron(boolean pigsMightFly) {
List<Pig> airbornePigs = new ArrayList<Pig>();
while (...) {
Pig piggy = new Pig();
...
if (pigsMightFly) {
airbornePigs.add(piggy);
}
...
}
return airbornePigs.size() > 0 ? airbornePigs : null;
}
In the above, we know that if pigsMightFly is false, that the list object won't be used. But no mainstream Java compiler could be expected to figure this out.
Object references in instance variables or in array cells where the data structure invariants mean that they won't be used. @edalorzo's stack example is an example of this.
It should be noted that the compiler / runtime can sometimes figure out that an in-scope variable is effectively dead. For example:
public void method(...) {
Object o = ...
Object p = ...
while (...) {
// Do things to 'o' and 'p'
}
// No further references to 'o'
// Do lots more things to 'p'
}
Some Java compilers / runtimes may be able to detect that 'o' is not needed after the loop ends, and treat the variable as dead.
1 - In fact, what we are talking about here is strong reachability. The GC reachability model is more complicated when you consider soft, weak and phantom references. However, these are not relevant to the OP's use-case.
2 - In Java 11 there is an experimental GC called the Epsilon GC that explicitly doesn't collect anything.
They will be garbage collected unless you have other references to them (probably manual). If you just have a reference to the tree, then yes, they will be garbage collected.
You can't set an object to null, only a variable which might contain a pointer/reference to this object. The object itself is not affected by this. But if no path now exists from any live thread (i.e. from a local variable of any running method) to your object, it will be garbage-collected if and when the memory is needed. This applies to any object, including the ones referred to from your original tree object.
Note that for local variables you normally do not have to set them to null if the method (or block) will finish soon anyway.
myTree is just a reference variable that previously pointed to an object in the heap. Now you are setting that to null. If you don't have any other reference to that object, then that object will be eligible for garbage collection.
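To ask the garbage collector to remove the object that myTree referred to, you can call System.gc() after you've set the variable to null. Note that System.gc() is only a request; the JVM is free to ignore it.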
myTree=null;
System.gc();
Note that the object is removed only when there is no other reference pointing to it.
In Java, you do not need to explicitly set objects to null to allow them to be GC'd. Objects are eligible for GC when there are no references to them (ignoring the java.lang.ref.* classes).
An object gets collected when there are no more references to it.
In your case, the nodes referred to directly by the object formerly referenced by myTree (the root node) will be collected, and so on.
This of course is not the case if you have outstanding references to nodes outside of the tree. Those will get GC'd once those references go out of scope (along with anything only they refer to).
Can someone explain the difference between the three Reference classes (or post a link to a nice explanation)? SoftReference > WeakReference > PhantomReference, but when would I use each one? Why is there a WeakHashMap but no SoftHashMap or PhantomHashMap?
And if I use the following code...
WeakReference<String> ref = new WeakReference<String>("Hello!");
if (ref != null) { // ref can get collected at any time...
System.gc(); // Let's assume ref gets collected here.
System.out.println(ref.get()); // Now what?!
}
...what happens? Do I have to check if ref is null before every statement (this is wrong, but what should I do)? Sorry for the rapid-fire questions, but I'm having trouble understanding these Reference classes... Thanks!
The Java library documentation for the java.lang.ref package characterizes the decreasing strength of the three explicit reference types.
You use a SoftReference when you want the referenced object to stay alive until the host process is running low on memory. The object will not be eligible for collection until the collector needs to free memory. Loosely stated, binding a SoftReference means, "Pin the object until you can't anymore."
By contrast, use a WeakReference when you don't want to influence the referenced object's lifetime; you merely want to make a separate assertion about the referenced object, so long as it remains alive. The object's eligibility for collection is not influenced by the presence of bound WeakReferences. Something like an external mapping from object instance to related property, where the property need only be recorded so long as the related object is alive, is a good use for WeakReferences and WeakHashMap.
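As a rough sketch of that "external mapping" idea: the class and method names below (Annotations, annotate, noteFor) are invented for illustration, and only java.util.WeakHashMap is real API.

```java
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical side table that attaches a note to arbitrary objects.
final class Annotations {
    // Keys are held weakly: an entry does not keep its key alive, and the
    // entry disappears some time after the key becomes unreachable.
    private static final Map<Object, String> NOTES = new WeakHashMap<>();

    static void annotate(Object target, String note) {
        NOTES.put(target, note);
    }

    static String noteFor(Object target) {
        return NOTES.get(target); // null if there is no (remaining) entry
    }

    public static void main(String[] args) {
        Object connection = new Object();
        annotate(connection, "opened by worker-1");
        // The entry is guaranteed to be present while 'connection' is strongly held.
        System.out.println(noteFor(connection));
    }
}
```

Two caveats worth keeping in mind: values are held strongly, so a value that refers back to its own key defeats the purpose, and WeakHashMap compares keys with equals()/hashCode() rather than by identity.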
The last one—PhantomReference—is harder to characterize. Like WeakReference, such a bound PhantomReference exerts no influence on the referenced object's lifetime. But unlike the other reference types, one can't even dereference a PhantomReference. In a sense, it doesn't point to the thing it points to, as far as callers can tell. It merely allows one to associate some related data with the referenced object—data that can later be inspected and acted upon when the PhantomReference gets queued in its related ReferenceQueue. Normally one derives a type from PhantomReference and includes some additional data in that derived type. Unfortunately, there's some downcasting involved to make use of such a derived type.
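The "derive a type and downcast" pattern might look roughly like the sketch below. CleanupRef, resourceName and "temp-file-42" are hypothetical names. Also note one big assumption: the sketch calls enqueue() by hand purely so that it runs deterministically; in real code the garbage collector enqueues the reference once the referent becomes phantom reachable.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;

final class PhantomDemo {
    // Derived reference type carrying the extra data needed for cleanup.
    static final class CleanupRef extends PhantomReference<Object> {
        final String resourceName; // hypothetical payload

        CleanupRef(Object referent, ReferenceQueue<Object> queue, String resourceName) {
            super(referent, queue);
            this.resourceName = resourceName;
        }
    }

    static String simulateCleanup(Object owner) {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        CleanupRef ref = new CleanupRef(owner, queue, "temp-file-42");

        // ref.get() always returns null for a phantom reference,
        // even though 'owner' is still strongly reachable here.

        // ASSUMPTION: manual enqueue stands in for the collector's work.
        ref.enqueue();

        Reference<?> polled = queue.poll();       // the queue hands back a plain Reference
        CleanupRef cleanup = (CleanupRef) polled; // the downcast mentioned above
        return cleanup.resourceName;
    }

    public static void main(String[] args) {
        Object owner = new Object();
        System.out.println(simulateCleanup(owner));
    }
}
```

On recent JDKs, java.lang.ref.Cleaner wraps this pattern and is usually the more convenient choice.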
In your example code, it's not the ref reference (or, if you prefer, "variable") that can be null. Rather, it's the value obtained by calling Reference#get() that may be null. If it is found to be null, you're too late; the referenced object is already on its way to being collected:
final String val = ref.get();
if (null != val)
{
// "val" is now pinned strongly.
}
else
{
// "val" is already ready to be collected.
}
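Building on that snippet, here is a hedged sketch of the usual "read once into a strong local, then check" pattern wrapped in a tiny cache. WeakCached is a made-up class; simulateCollection() calls Reference.clear() only so the demo doesn't depend on when the GC actually runs.

```java
import java.lang.ref.WeakReference;
import java.util.function.Supplier;

// Hypothetical weakly-cached value that is recomputed after collection.
final class WeakCached<T> {
    private final Supplier<T> loader;
    private WeakReference<T> ref = new WeakReference<>(null);
    private int loads = 0;

    WeakCached(Supplier<T> loader) {
        this.loader = loader;
    }

    T get() {
        T value = ref.get();       // read ONCE into a strong local variable
        if (value == null) {       // referent was collected (or never loaded)
            value = loader.get();
            ref = new WeakReference<>(value);
            loads++;
        }
        return value;              // the strong local pins it for the caller
    }

    int loads() {
        return loads;
    }

    // Stand-in for the garbage collector clearing the reference.
    void simulateCollection() {
        ref.clear();
    }

    public static void main(String[] args) {
        WeakCached<String> cached = new WeakCached<>(() -> new String("expensive value"));
        String first = cached.get();
        cached.simulateCollection();
        String second = cached.get();
        System.out.println(first.equals(second) + " after " + cached.loads() + " loads");
    }
}
```

Calling ref.get() twice (once for the null check and once for the use) would be a bug, because the referent can be collected between the two calls.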
A link: https://community.oracle.com/blogs/enicholas/2006/05/04/understanding-weak-references
PhantomHashMap wouldn't work very well as get always returns null for phantom references.
Caches are difficult, so SoftHashMap might not work as well as you might think. However, I believe Google's collection library contains a general reference map implementation.
You should always check that get returns non-null. (Note that this is not the same as checking that the Reference reference itself is non-null.) In the case of interned strings it always will, but (as ever) don't try to be "clever" about it.
It should also be mentioned, as stated in the comment by Truong Xuan Tinh here (http://blog.yohanliyanage.com/2010/10/ktjs-3-soft-weak-phantom-references/), that the JRockit JVM implements weak/soft/phantom references differently from the Sun JVM.
String str = new String("hello, world");
WeakReference<String> ref = new WeakReference<String>(str);
str = null;
if (ref != null) {
System.gc();
System.out.println(ref.get());
}
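In this case, it will typically output null, because the String is only weakly reachable once str is set to null. The call to System.gc() is important here, although the JVM is not required to honor it.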
Help me settle a dispute with a coworker:
Does setting a variable or collection to null in Java aid in garbage collection and reducing memory usage? If I have a long running program and each function may be iteratively called (potentially thousands of times): Does setting all the variables in it to null before returning a value to the parent function help reduce heap size/memory usage?
That's old performance lore. It was true back in 1.0 days, but the compiler and the JVM have been improved to eliminate the need (if ever there was one). This excellent IBM article gets into the details if you're interested: Java theory and practice: Garbage collection and performance
From the article:
There is one case where the use of explicit nulling is not only helpful, but virtually required, and that is where a reference to an object is scoped more broadly than it is used or considered valid by the program's specification. This includes cases such as using a static or instance field to store a reference to a temporary buffer, rather than a local variable, or using an array to store references that may remain reachable by the runtime but not by the implied semantics of the program.
Translation: "explicitly null" persistent objects that are no longer needed. (If you want. "Virtually required" too strong a statement?)
The Java Language Specification
12.6.1 Implementing Finalization
Every object can be characterized by two attributes: it may be reachable, finalizer-reachable, or unreachable, and it may also be unfinalized, finalizable, or finalized.
A reachable object is any object that can be accessed in any potential continuing computation from any live thread. Optimizing transformations of a program can be designed that reduce the number of objects that are reachable to be less than those which would naively be considered reachable. For example, a compiler or code generator may choose to set a variable or parameter that will no longer be used to null to cause the storage for such an object to be potentially reclaimable sooner.
Discussion
Another example of this occurs if the values in an object's fields are stored in registers. The program may then access the registers instead of the object, and never access the object again. This would imply that the object is garbage.
The object is reachable if it can be involved in any potential continuing computation. So if your code refers to a local variable, and nothing else refers to it, then you might cause the object to be collected by setting it to null. This would either give a null pointer exception, or change the behaviour of your program, or if it does neither you didn't need the variable in the first place.
If you are nulling out a field or an array element, then that can possibly make sense for some applications, and it will cause the memory to be reclaimed faster. One case is creating a large array to replace an existing array referenced by a field in a class - if the field is nulled before the replacement is created, then it may relieve pressure on the memory.
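A minimal sketch of the array-replacement case, assuming the old contents do not need to be copied over. FrameBuffer and its members are hypothetical names.

```java
// Hypothetical holder of one large array that is occasionally replaced.
final class FrameBuffer {
    private byte[] pixels;

    FrameBuffer(int size) {
        pixels = new byte[size];
    }

    void resize(int newSize) {
        // Drop the old array first, so it is already unreachable while the
        // new one is being allocated; otherwise both must fit in the heap.
        pixels = null;
        pixels = new byte[newSize];
    }

    int capacity() {
        return pixels.length;
    }

    public static void main(String[] args) {
        FrameBuffer buffer = new FrameBuffer(1024);
        buffer.resize(4096);
        System.out.println(buffer.capacity()); // prints 4096
    }
}
```

This only helps if nothing else references the old array, and other threads reading the field would briefly observe null, so it is not a drop-in change for shared state.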
Another interesting feature of Java is that scope doesn't appear in class files, so scope is not relevant to reachability; these two methods create the same bytecode, and hence the VM does not see the scope of the created object at all:
static void withBlock () {
int x = 1;
{
Object a = new Object();
}
System.out.println(x+1);
}
static void withoutBlock () {
int x = 1;
Object a = new Object();
System.out.println(x+1);
}
Not necessarily. An object becomes eligible for garbage collection when there are no live threads anymore that hold a reference to the object.
Local variables go out of scope when the method returns and it makes no sense at all to set local variables to null - the variables disappear anyway, and if there's nothing else that holds a reference to the objects that the variables referred to, then those objects become eligible for garbage collection.
The key is not to look at just variables, but look at the objects that those variables refer to, and find out where those objects are referenced by your program.
It is useless on local variables, but it can be useful/needed to clear up instance variables that are not required anymore (e.g. post-initialization).
(Yeah yeah, I know how to apply the Builder pattern...)
That could only make some sense in some scenario like this:
public void myHeavyMethod() {
List hugeList = loadHugeListOfStuff(); // lots of memory used
ResultX res = processHugeList(hugeList); // compute some result or summary
// hugeList = null; // we are done with hugeList
...
// do a lot of other things that takes a LOT of time (seconds?)
// and which do not require hugeList
...
}
Here it could be of some benefit to uncomment the hugeList = null line, I guess.
But it would certainly make more sense to rewrite the method (perhaps refactoring it into two, or introducing an inner scope).
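The "refactor into two" option could look something like this. loadHugeList() and summarize() are hypothetical stand-ins for loadHugeListOfStuff() and processHugeList(), and a list of 1000 integers plays the role of the huge list.

```java
import java.util.ArrayList;
import java.util.List;

final class HeavyWork {
    // Hypothetical stand-in for loadHugeListOfStuff().
    private static List<Integer> loadHugeList() {
        List<Integer> list = new ArrayList<>();
        for (int i = 1; i <= 1000; i++) {
            list.add(i);
        }
        return list;
    }

    // The big list is a local of this short method only, so no reference
    // to it survives once the method returns.
    private static long summarize() {
        List<Integer> hugeList = loadHugeList();
        long sum = 0;
        for (int value : hugeList) {
            sum += value;
        }
        return sum;
    }

    static long myHeavyMethod() {
        long res = summarize();
        // ... long-running work that needs only 'res' would go here ...
        return res;
    }

    public static void main(String[] args) {
        System.out.println(myHeavyMethod()); // prints 500500
    }
}
```

Compared with hugeList = null, this version has no variable left behind that a later edit could accidentally dereference.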
Setting an object reference to null only makes it eligible for garbage collection.
It does not necessarily free up the memory, which depends on when the garbage collector runs (which depends on the JVM).
When the garbage collector runs, it frees up the heap by deleting only the objects which are eligible for garbage collection.
It is good to have. When you set references to null, there is a possibility that the object can be garbage collected sooner, in the next GC cycle. But there is no guaranteed mechanism to make an object garbage collected at a given time.
In what situations in Java is explicit nulling useful? Does it in any way assist the garbage collector by making objects unreachable or something? Is it considered to be a good practice?
In Java it can help if you've got a very long-running method, and the only reference to an object is via a local variable. Setting that local variable to null when you don't need it any more (but when the method is going to continue to run for a long time) can help the GC. (In C# this is very rarely useful as the GC takes "last possible use" into account. That optimization may make it to Java some time - I don't know.)
Likewise if you've got a member field referring to an object and you no longer need it, you could potentially aid GC by setting the field to null.
In my experience, however, it's rarely actually useful to do either of these things, and it makes the code messier. Very few methods really run for a long time, and setting a variable to null really has nothing to do with what you want the method to achieve. It's not good practice to do it when you don't need to, and if you do need to you should see whether refactoring could improve your design in the first place. (It's possible that your method or type is doing too much.)
Note that setting the variable to null is entirely passive - it doesn't inform the garbage collector that the object can be collected, it just avoids the garbage collector seeing that reference as a reason to keep the object alive next time it (the GC) runs.
In general it isn't needed (of course that can depend on the VM implementation). However if you have something like this:
private static final Map<String, String> foo = new HashMap<String, String>();
and then have items in the map that you no longer need, they will not be eligible for garbage collection, so you would need to explicitly remove them. There are many cases like this (event listeners are another area where this can happen).
But doing something like this:
void foo()
{
Object o;
// use o
o = null; // don't bother doing this, it isn't going to help
}
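For the static map case mentioned above, a minimal sketch of "explicitly remove them" might look like this. SessionRegistry and its methods are invented names, and the byte arrays merely stand in for whatever per-entry data the map holds.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical registry backed by a static map, which never goes out of scope.
final class SessionRegistry {
    private static final Map<String, byte[]> SESSIONS = new HashMap<>();

    static void open(String id) {
        SESSIONS.put(id, new byte[1024]); // stays reachable via the static map
    }

    static void close(String id) {
        SESSIONS.remove(id); // the only way this entry ever becomes collectable
    }

    static int openCount() {
        return SESSIONS.size();
    }

    public static void main(String[] args) {
        open("a");
        open("b");
        close("a");
        System.out.println(openCount()); // prints 1
    }
}
```

Forgetting the close() call is the classic leak here; the same applies to listener lists that are only ever added to.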
Edit (forgot to mention this):
If you work at it, you should find that 90-95% of the variables you declare can be made final. A final variable cannot change what it points at (or what its value is for primitives). In most cases where a variable is final it would be a mistake (bug) for it to receive a different value while the method is executing.
If you want to be able to set the variable to null after use it cannot be final, which means that you have a greater chance to create bugs in the code.
One special case where I found it useful is when you have a very large object, and want to replace it with another large object. For example, look at the following code:
BigObject bigObject = new BigObject();
// ...
bigObject = new BigObject(); // line 3
If an instance of BigObject is so large that you can have only one such instance in the heap, line 3 will fail with OutOfMemoryError, because the 1st instance cannot be freed until the assignment instruction in line 3 completes, which is obviously after the 2nd instance is ready.
Now, if you set bigObject to null right before line 3:
bigObject = null;
bigObject = new BigObject(); // line 3
the 1st instance can be freed when the JVM runs out of heap during the construction of the 2nd instance.
From "Effective Java" : use it to eliminate obsolete object references. Otherwise it can lead to memory leaks which can be very hard to debug.
public Object pop(){
if(size == 0)
throw new EmptyStackException();
Object result = elements[--size];
elements[size] = null; //Eliminate Object reference
return result;
}
If you are nulling an object that is about to go out of scope anyway when your method block closes, then there is no benefit whatsoever in terms of garbage collection. It is not unusual to encounter people who don't understand this who work really hard to set a lot of things to null needlessly.
Explicit nulling can help with GC in some rare situations where all of the following are true:
The variable is the only (non-weak) reference to the object
You can guarantee that the object will no longer be needed
The variable will stay in scope for an extended period of time (e.g. it is a field in a long-lived object instance)
The compiler is unable to prove that the object is no longer used, but you are able to guarantee this through your superior logical analysis of the code :-)
In practice this is quite rare in good code: if the object is no longer needed, you should normally be declaring it in a narrower scope anyway. For example, if you only need the object during a single invocation of a method, it should be a local variable, not a field in the enclosing object.
One situation where explicit nulling is genuinely useful: if null is used to indicate a specific state then setting to a null value is sometimes going to be necessary and useful. Null is a useful value in itself for a couple of reasons:
Null checks are extremely fast, so conditional code that checks for null is typically more efficient than many alternatives (e.g. calling object.equals())
You get an immediate NullPointerException if you try to dereference it. This is useful because it is good Fail Fast coding style that will help you to catch logic errors.
See also WeakReference in J2SE.