garbage collection of invisible variables


I have the following code:
void method() {
    Object o1 = new Object();
    {
        Object o2 = new Object();
        System.out.println(o2);
    }
    // any long operation
}
Will the o2 object be eligible for garbage collection during the execution of the long operation?

The JLS definition of reachability is:
"A reachable object is any object that can be accessed in any potential continuing
computation from any live thread."
In this case, the reference ceases to be theoretically accessible to ongoing computations before the println call returns. (I'm assuming that println(o2) doesn't save the reference somewhere.)
However, in practice no JVMs in existence can tell that the Object becomes unreachable during the call, and most JVMs will only notice this when ... or after ... o2 goes out of scope. And even then, a GC run is not guaranteed to remove the object.
Note: that doesn't contradict the JLS, because the "reachable object" test is really telling you when the object won't be garbage collected, not when it will be. The JLS is careful to specify that an object may be finalized and garbage collected at some point after it becomes unreachable, but that it also may never be finalized and garbage collected.

Yes; however, this depends on whether the JVM/JIT optimizes the code to avoid superfluous stack operations,
which would make it equivalent to:
Object o1 = new Object();
Object o2 = new Object();
System.out.println(o2);
// any long operation
Many compilers gather all the local variables a method needs, work out the maximum space required to hold them (some are eliminated or kept only in registers), grow the stack by that amount on entry, and shrink it only when the method returns.
This would mean that o2 stays in memory that the GC considers "accessible" unless its slot is overwritten by another variable in another scope.

You need to understand that your variable o2 and the OBJECT DESIGNATED BY o2 are different.
The variable o2 is actually a pointer (though Java prefers to call them "references") and occupies 4 or 8 bytes in the automatic stack frame. This storage is not garbage collected and only goes away when you return from the procedure (or possibly when you exit the {} brackets depending on compiler implementation).
The object "designated by" (pointed to by) o2 is essentially available for possible garbage collection as soon as the new Object() operation ends, and the existence of a pointer to it in o2 is all that prevents this. Once the variable o2 either no longer exists in a stack frame or has a different pointer value stored into it then the object is eligible to be collected.
So in your particular case the answer is "maybe". It depends on how the compiler and JIT handle the {}, along with a few "luck" issues as to whether, having exited the {} block (but not the method as a whole), the storage location for o2 is reused for something else.

No. Even though the object referenced by o2 is not reachable, it will not be garbage-collected. It is in a state between reachable and unreachable called "invisible" because the reference variable o2 is still on the stack.
To make the object garbage-collectible, assign o2 = null or put that block in another function.
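A minimal sketch of the two workarounds just mentioned, assuming a hypothetical class ScopeDemo and a stand-in longOperation() (neither is from the original question). Neither variant forces a collection; each only removes the last strong reference before the long-running part starts.

```java
class ScopeDemo {
    // Variant 1: clear the local explicitly once it is no longer needed.
    static void withExplicitNull() {
        Object o1 = new Object();
        Object o2 = new Object();
        System.out.println(o2);
        o2 = null; // the stack slot no longer refers to the object
        longOperation(o1);
    }

    // Variant 2: confine the short-lived object to its own method, so its
    // stack frame (and the reference in it) is gone before the long operation.
    static void withHelperMethod() {
        Object o1 = new Object();
        printTemporary();
        longOperation(o1);
    }

    static void printTemporary() {
        Object o2 = new Object();
        System.out.println(o2);
    }

    // Hypothetical stand-in for "any long operation".
    static int longOperation(Object o) {
        int sum = 0;
        for (int i = 0; i < 1_000; i++) {
            sum += i;
        }
        return sum;
    }

    public static void main(String[] args) {
        withExplicitNull();
        withHelperMethod();
    }
}
```

Whether either variant changes when the object is actually reclaimed still depends on the JVM and on when a collection happens to run.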
source: A 2001 book on Java performance

Related

Java WeakReference is still holding valid reference when referent is no longer valid

I'm confused with this program:
import java.lang.ref.WeakReference;

class Point {
    private final int x = 0;
    private final int y = 0;
}
public class App
{
    WeakReference<Point> newPoint() {
        Point referent = new Point();
        return new WeakReference<Point>(referent); // after return, the local variable referent no longer exists
    }
    public static void main( String[] args ) {
        App a = new App();
        WeakReference<Point> wp = a.newPoint(); // does wp hold a valid or an invalid reference?
        System.out.println(wp.get()); // not null
    }
}
I thought that if a weak reference points to an object that is no longer alive, its get() should return null. But in my code, it seems the object is still alive.
Where did I go wrong?

I thought that if a weak reference points to an object that is no longer alive, its get() should return null. But in my code, it seems the object is still alive.
Your understanding is imprecise, especially where it relies on the idea of aliveness. Reference objects in general and WeakReference objects in particular are not directly concerned with any of the senses of aliveness that I recognize. Rather, they are concerned with reachability.
The API docs for java.lang.Reference#get() (which is not overridden by WeakReference) say this:
Returns this reference object's referent. If this reference object has been cleared, either by the program or by the garbage collector, then this method returns null.
Note well that the condition for get() returning null is rather specific: the reference object has been cleared. As the doc indicates, this may be done by the garbage collector or by the application. The application does it by invoking the instance's clear() method; the garbage collector clears references directly, without invoking that method.
Among the key differences between Reference subclasses is the conditions under which the garbage collector will perform such clearing. For WeakReferences, the API docs say:
Suppose that the garbage collector determines at a certain point in time that an object is weakly reachable. At that time it will atomically clear all weak references to that object [...].
Thus, until the garbage collector determines that a given object is (only) weakly reachable, it will not clear weak references to that object. The garbage collector probably does not run at all during the brief run of your small program, and if it did run, it would be surprising for it to be timed correctly to observe the Point in question to be weakly reachable before the reference's get() method is invoked.
You could try forcing a GC run by invoking System.gc() at the appropriate place. I anticipate that doing so will result in the weak reference being cleared. That would be for demonstrative purposes only, however. Generally speaking, you should rely on Java to perform GC when appropriate, not force it.
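A small sketch of the behaviour described above, using a hypothetical class WeakRefDemo. The first two results follow from the documented contract of get() and clear(); the outcome after System.gc() is deliberately printed without an expected value, because the call is only a request.

```java
import java.lang.ref.WeakReference;

class WeakRefDemo {
    public static void main(String[] args) {
        Object referent = new Object();
        WeakReference<Object> ref = new WeakReference<>(referent);

        // While a strong reference exists, get() returns the referent.
        System.out.println(ref.get() == referent); // true

        // Clearing by the program: get() returns null from now on.
        ref.clear();
        System.out.println(ref.get()); // null

        // Clearing by the collector: System.gc() is only a request,
        // so either outcome is possible here.
        WeakReference<Object> other = new WeakReference<>(new Object());
        System.gc();
        System.out.println(other.get() == null ? "cleared by GC" : "not cleared yet");
    }
}
```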

When does object go out of scope if no variable is assigned?

When does the List object, which occupies memory, become eligible for garbage collection? Also, where is the variable that holds the reference to the list? In the code below, no variable is assigned to it.
CASE 1:
for (Integer i : returnList()) {
System.out.println(i);
}
In the case of code like:
CASE 2:
List<Integer> list = returnList();
for (Integer i : list) {
System.out.println(i);
}
list = null;
we can take control of GC. Is there any way to do the same in the first case, where no variable is assigned?
To summarize:
What is the mechanism of reference, without a reference variable to the list, in case 1?
Does the list become eligible for GC when the stack frame is popped?
Is there any way to speed up its eligibility for GC?

What is the mechanism of reference, without a reference variable to the list, in case 1?
There is an implicit reference to the list. This can be seen by understanding that an enhanced for loop like that is translated into:
for (Iterator<Integer> e = returnList().iterator(); e.hasNext(); ) {
    Integer i = e.next();
    System.out.println(i);
}
Here, e holds a reference to an iterator over the list returned by returnList(), and the iterator in turn holds a reference to that list. Thus, the list is rooted as long as e is rooted, which is only true while control is in the for loop. When control leaves the for loop, e is eligible for collection, and so is the list.
Of course, all of this is assuming that
The owner of returnList isn't maintaining a reference to its return value.
The same list hasn't been returned to another caller and that other caller isn't maintaining a reference to the same list.
Does list get GC'd when stack frame is popped ?
Not necessarily. It will be eligible for collection when the JVM can determine that the referent has no rooted references to it. Note that it does not necessarily get collected immediately.
Any way to speed up GC in case 1.
It can't be collected any sooner than control leaving the for loop. It might be collected after control leaves the for loop. Let the JVM worry about this.
Note that you can attempt a manual garbage collection via
System.gc();
Be aware, though, that this can make things worse: if it does trigger a collection, it may be a full collection, and the JVM is also free to ignore the request. Either way, you may be wasting a lot of CPU cycles. On a system with infinite memory, the garbage collector would never need to run; on such a system, requesting a collection would be a complete waste of CPU cycles if the garbage collector obeyed your request.
Let the JVM manage the garbage collections. The algorithms for it are highly tuned.

From SCJP page 256-257
The garbage collector does some magical, unknown operations, and when
it discovers an object that can't be reached by any live thread,it
will consider that object as eligible for deletion, and it might even
delete it at some point. (You guessed it; it also might not ever
delete it.) When we talk about reaching an object, we're really
talking about having a reachable reference variable that refers to the
object in question. If our Java program has a reference variable that
refers to an object, and that reference variable is available to a
live thread, then that object is considered reachable.
Setting a reference to null might bring forward the point at which the GC deletes the object, but it might just as well not; you don't have control over that. The JVM will run the GC when it senses that memory is running low. You can manually ask it to do so, but nothing is guaranteed.

What is the mechanism of reference, without a reference variable to
the list, in case 1?
See the next answer.
Does list get GC'd when stack frame is popped ?
The list is created on the heap - but if the only reference to it was on the stack it is eligible to be collected by GC. That doesn't mean that it'll happen any time soon though.
Any way to speed up GC in case 1.
You can't "speed up" GC, even by calling System.gc(); you're only "suggesting" that the GC do its work. Again, it won't necessarily happen any time soon.
There's a lot of sense behind this too: say that your program has 2GB of memory to use and is currently using only 2KB. That does not justify the GC stopping your program to clean up memory just because some objects are eligible for deletion.

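You cannot take control of GC. It is managed by the JVM. What you can manage is which objects are eligible for garbage collection. You can, however, observe when the garbage collector reclaims objects by overriding the finalize method. If an object is finalized at all, finalize is called before its memory is reclaimed; there is no guarantee that it ever runs, and the method has been deprecated since Java 9.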
public class Demo{
static void create()
{
Demo o = new Demo();
}
public void finalize()
{
System.out.println("GC called");
}
public static void main (String ...ar)
{
for (long i=1;i<900000;i++) //Try changing values here
{
create();
}
}
}
Objects created inside a method become eligible for GC when the method returns (just as local variables exist only for the duration of the method). However, if the method returns an object, that object does not become eligible for garbage collection as long as the caller holds on to it:
import java.util.Date;

public class Demo {
    // Must be static to be callable from main, and must declare its return type.
    public static Date getDate()
    {
        Date o = new Date();
        StringBuffer d = new StringBuffer(o.toString());
        System.out.println(d);
        return o;
    }
    public static void main (String ...ar)
    {
        Date x = getDate();
    }
}
In the above code, object d is eligible for GC when the method returns, but object o is not, because main still refers to it through x.

What are the chances of getting exactly the same object reference twice

I sometimes assume that if oldObject != newObject then the object has changed - which seems a fair assumption in most cases but is it truly a bad assumption?
In short, under what situation could the following code print "Same!"?
static WeakReference<Object> oldO = null;
...
Object o = new Object();
oldO = new WeakReference<Object>(o);
// Do some stuff with o - could take hours or even days to complete.
...
// Discard o (or let it go out of scope).
o = null;
// More stuff - could be hours or days later.
...
o = new Object();
// Later still.
if ( o == oldO.get() ) {
System.out.println("Same!");
}
I realise that this is indeed remotely possible because an object reference is essentially the memory address of the object (or could be in some JVM). But how likely is it? Are we talking decades of run-time before it actually happens?
Added
My apologies - please assume that oldO is some form of weak reference that does not stop the object from being collected. Perhaps it is a WeakReference, as the code (now) suggests, or the reference is stored in a database or a file somewhere.

(I'm answering what I think what you really wanted to know, rather than the particular snippet you have)
It's implementation dependent. The contract of an object reference is that as long as the object is still alive, no other object will compare == with it. This implies that after the object is garbage collected, the VM is free to reuse the same object reference.
An implementation of Java may choose to use an increasing integer for object references, in which case you can only get the same object reference when the counter overflows back to 0. Other implementations may use the memory location, which makes it more likely for the same reference to be reused. In any case, you should define your own object identity if that matters.
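One way to "define your own object identity" is to give each object an explicit ID. This is only a sketch: the class names Tracked and IdentityDemo and the use of an AtomicLong counter are assumptions for illustration, not something the answer prescribes.

```java
import java.util.concurrent.atomic.AtomicLong;

class Tracked {
    private static final AtomicLong NEXT_ID = new AtomicLong();

    // Assigned once per object; not reused after the object is collected
    // (short of the 64-bit counter wrapping around).
    private final long id = NEXT_ID.incrementAndGet();

    long id() {
        return id;
    }
}

class IdentityDemo {
    public static void main(String[] args) {
        Tracked first = new Tracked();
        long rememberedId = first.id();
        first = null; // the first object may now be collected

        Tracked second = new Tracked();
        // Compare IDs instead of references.
        System.out.println(second.id() == rememberedId); // false
    }
}
```

Remembering the ID instead of the reference also means the old object is never kept alive just so it can be compared later.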

It will never be the same. oldO will always reference the initial object, so that object will never be discarded and a new object can't have the same address.
UPDATE: it seems the question was updated to specify that oldO is a weak reference. In this case, when the object goes away, oldO's referent will become null. This means it will never match another object in the JVM.

It is not possible for them to be equal. You still have a reference to the old object (oldO), so it will never be discarded.

o == oldO means o is the same memory address as oldO. So that cannot happen unless, at some time, you do either o = oldO or oldO = o. By transitivity, doing foo = o; oldO = foo or anything equivalent will achieve the same result, of course.

Firstly, memory address is irrelevant. Java ain't C. Object identity is a JVM implementation detail: it may or may not rely on the memory address, but more likely does not, since the JVM is free to move the object around in memory but must maintain its identity.
But regardless, because you hold a reference to the original object, the second one cannot be the "same" object.

It will never happen.
Your first object (oldO) is stored at a specific memory location.
Your second object will always be placed at another memory location, as long as oldO is referenced.
So oldO == o will compare both memory addresses, which will always be different.
If you drop the reference held in oldO, the first object will be garbage collected, and you'll eventually be able to create a new object at this same address. But you won't be able to compare it with oldO, because that reference is gone.

By having a reference to your old object, you prevent it being garbage collected, so you prevent that bit of memory being available for a new object, so they could never be equal.
I wondered if you used a SoftReference you could keep a reference to the old object, while allowing it to be garbage-collected BUT:
1) I assume once the old object is collected, the SoftReference is set to null and
2) this is artificially trying to force the situation, so doesn't really prove anything :-)

According to the documentation, weak references will be cleared when the object is garbage collected. It does not specify what it means to be "cleared", but presumably it is set to null. If it is in fact set to null, null will never be declared == to any object reference, regardless of its memory location.

Garbage collector in java - set an object null

Let's assume there is a Tree object with a root TreeNode object, and each TreeNode has leftNode and rightNode objects (e.g. a BinaryTree object).
If I call:
myTree = null;
what really happens to the related TreeNode objects inside the tree? Will they be garbage collected as well, or do I have to set all the related objects inside the tree object to null?

Garbage collection in Java is performed on the basis of "reachability". The JLS defines the term as follows:
"A reachable object is any object that can be accessed in any potential continuing computation from any live thread."
So long as an object is reachable [1], it is not eligible for garbage collection.
The JLS leaves it up to the Java implementation to figure out how to determine whether an object could be accessible. If the implementation cannot be sure, it is free to treat a theoretically unreachable object as reachable ... and not collect it. (Indeed, the JLS allows an implementation to not collect anything, ever! No practical implementation would do that though [2].)
In practice, (conservative) reachability is calculated by tracing; looking at what can be reached by following references starting with the class (static) variables, and local variables on thread stacks.
Here's what this means for your question:
If I call: myTree = null; what really happens to the related TreeNode objects inside the tree? Will they be garbage collected as well, or do I have to set all the related objects inside the tree object to null?
Let's assume that myTree contains the last remaining reachable reference to the tree root.
1. Nothing happens immediately.
2. If the internal nodes were previously only reachable via the root node, then they are now unreachable, and eligible for garbage collection. (In this case, assigning null to references to internal nodes is unnecessary.)
3. However, if the internal nodes were reachable via other paths, they are presumably still reachable, and therefore NOT eligible for garbage collection. (In this case, assigning null to references to internal nodes is a mistake. You are dismantling a data structure that something else might later try to use.)
If myTree does not contain the last remaining reachable reference to the tree root, then nulling the internal references is a mistake for the same reason as in 3. above.
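The cases above can be sketched as follows. The TreeNode fields and the TreeDemo class are hypothetical, modelled loosely on the question's description; the example only shows which nodes stay reachable and does not try to observe the collector.

```java
class TreeNode {
    TreeNode left;
    TreeNode right;
    final String label;

    TreeNode(String label) {
        this.label = label;
    }
}

class TreeDemo {
    static TreeNode buildTree() {
        TreeNode root = new TreeNode("root");
        root.left = new TreeNode("left");
        root.right = new TreeNode("right");
        return root;
    }

    public static void main(String[] args) {
        TreeNode myTree = buildTree();
        TreeNode keptElsewhere = myTree.left; // a second path to one node

        // The root and "right" are now unreachable; no per-node nulling needed.
        myTree = null;

        // "left" is still strongly reachable through keptElsewhere,
        // so it is not eligible for collection and must stay intact.
        System.out.println(keptElsewhere.label); // left
    }
}
```

Had the code walked the tree and nulled every field before dropping myTree, nothing would have been gained for the unreachable nodes, and any subtree below a node still held elsewhere would have been lost.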
So when should you null things to help the garbage collector?
The cases where you need to worry are when you can figure out that the reference in some cell (local, instance or class variable, or array element) won't be used again, but the compiler and runtime can't! The cases fall into roughly three categories:
Object references in class variables ... which (by definition) never go out of scope.
Object references in local variables that are still in scope ... but won't be used. For example:
public List<Pig> pigSquadron(boolean pigsMightFly) {
List<Pig> airbornePigs = new ArrayList<Pig>();
while (...) {
Pig piggy = new Pig();
...
if (pigsMightFly) {
airbornePigs.add(piggy);
}
...
}
return airbornePigs.size() > 0 ? airbornePigs : null;
}
In the above, we know that if pigsMightFly is false, that the list object won't be used. But no mainstream Java compiler could be expected to figure this out.
Object references in instance variables or in array cells where the data structure invariants mean that they won't be used. @edalorzo's stack example is an example of this.
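The stack example referred to is not reproduced here. As an assumption about what is meant, the following sketch shows the usual array-backed stack, where slots at or beyond the current size are never read again but would keep their objects reachable unless cleared. The class SimpleStack and its slot() inspection method are hypothetical.

```java
import java.util.Arrays;

class SimpleStack {
    private Object[] elements = new Object[4];
    private int size = 0;

    void push(Object e) {
        if (size == elements.length) {
            elements = Arrays.copyOf(elements, 2 * size);
        }
        elements[size++] = e;
    }

    Object pop() {
        if (size == 0) {
            throw new IllegalStateException("stack is empty");
        }
        Object result = elements[--size];
        // The slot is now beyond 'size' and will never be read again,
        // but the array would still keep the object reachable.
        elements[size] = null;
        return result;
    }

    // Exposed only so the effect on the backing array can be inspected.
    Object slot(int index) {
        return elements[index];
    }

    public static void main(String[] args) {
        SimpleStack stack = new SimpleStack();
        stack.push("a");
        System.out.println(stack.pop());   // a
        System.out.println(stack.slot(0)); // null
    }
}
```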
It should be noted that the compiler / runtime can sometimes figure out that an in-scope variable is effectively dead. For example:
public void method(...) {
Object o = ...
Object p = ...
while (...) {
// Do things to 'o' and 'p'
}
// No further references to 'o'
// Do lots more things to 'p'
}
Some Java compilers / runtimes may be able to detect that 'o' is not needed after the loop ends, and treat the variable as dead.
[1] In fact, what we are talking about here is strong reachability. The GC reachability model is more complicated when you consider soft, weak and phantom references. However, these are not relevant to the OP's use-case.
[2] In Java 11 there is an experimental GC called the Epsilon GC that explicitly doesn't collect anything.

They will be garbage collected unless you have other references to them (probably manual). If you just have a reference to the tree, then yes, they will be garbage collected.

You can't set an object to null, only a variable that might contain a pointer/reference to this object. The object itself is not affected by this. But if no path from any live thread (i.e. from a local variable of any running method) to your object exists any more, it will be garbage-collected, if and when the memory is needed. This applies to any object, including the ones referred to from your original tree object.
Note that for local variables you normally do not have to set them to null if the method (or block) will finish soon anyway.

myTree is just a reference variable that previously pointed to an object in the heap. Now you are setting that to null. If you don't have any other reference to that object, then that object will be eligible for garbage collection.
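Setting it to null is all that is needed to make the object eligible. If you want, you can additionally request a collection by calling System.gc() after you've set it to null, but the JVM is free to ignore the request: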
myTree=null;
System.gc();
Note that the object is removed only when there is no other reference pointing to it.

In Java, you do not need to explicitly set references to null to allow objects to be GC'd. Objects are eligible for GC when there are no references to them (ignoring the java.lang.ref.* classes).

An object gets collected when there are no more references to it.
In your case, the nodes referred to directly by the object formerly referenced by myTree (the root node) will be collected, and so on.
This of course is not the case if you have outstanding references to nodes outside of the tree. Those will get GC'd once those references go out of scope (along with anything only they refer to).

Does variable = null set it for garbage collection

Help me settle a dispute with a coworker:
Does setting a variable or collection to null in Java aid in garbage collection and reducing memory usage? If I have a long running program and each function may be iteratively called (potentially thousands of times): Does setting all the variables in it to null before returning a value to the parent function help reduce heap size/memory usage?

That's old performance lore. It was true back in 1.0 days, but the compiler and the JVM have been improved to eliminate the need (if ever there was one). This excellent IBM article gets into the details if you're interested: Java theory and practice: Garbage collection and performance

From the article:
There is one case where the use of explicit nulling is not only helpful, but virtually required, and that is where a reference to an object is scoped more broadly than it is used or considered valid by the program's specification. This includes cases such as using a static or instance field to store a reference to a temporary buffer, rather than a local variable, or using an array to store references that may remain reachable by the runtime but not by the implied semantics of the program.
Translation: "explicitly null" persistent objects that are no longer needed. (If you want. Is "virtually required" too strong a statement?)

The Java Language Specification
12.6.1 Implementing Finalization
Every object can be characterized by two attributes: it may be reachable, finalizer-reachable, or unreachable, and it may also be unfinalized, finalizable, or finalized.
A reachable object is any object that can be accessed in any potential continuing computation from any live thread. Optimizing transformations of a program can be designed that reduce the number of objects that are reachable to be less than those which would naively be considered reachable. For example, a compiler or code generator may choose to set a variable or parameter that will no longer be used to null to cause the storage for such an object to be potentially reclaimable sooner.
Discussion
Another example of this occurs if the values in an object's fields are stored in registers. The program may then access the registers instead of the object, and never access the object again. This would imply that the object is garbage.
The object is reachable if it can be involved in any potential continuing computation. So if your code refers to a local variable, and nothing else refers to it, then you might cause the object to be collected by setting it to null. This would either give a null pointer exception, or change the behaviour of your program, or if it does neither you didn't need the variable in the first place.
If you are nulling out a field or an array element, then that can possibly make sense for some applications, and it may allow the memory to be reclaimed sooner. One case is creating a large array to replace an existing array referenced by a field in a class: if the field is nulled before the replacement is created, then it may relieve pressure on the memory.
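A minimal sketch of the array-replacement case, assuming a hypothetical class Workspace with a single buffer field. Without the explicit null, the old array stays reachable through the field while the new one is being allocated, so both must fit in the heap at once.

```java
class Workspace {
    private int[] buffer = new int[1_000];

    // Drops the old array before allocating the new one, so the two
    // never need to be reachable at the same time.
    void replaceBuffer(int newLength) {
        buffer = null;
        buffer = new int[newLength];
    }

    int bufferLength() {
        return buffer.length;
    }

    public static void main(String[] args) {
        Workspace w = new Workspace();
        w.replaceBuffer(2_000);
        System.out.println(w.bufferLength()); // 2000
    }
}
```

This only helps if nothing else refers to the old array, and it leaves the field null for a moment, which matters if other threads can read it.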
Another interesting feature of Java is that scope doesn't appear in class files, so scope is not relevant to reachability; these two methods create the same bytecode, and hence the VM does not see the scope of the created object at all:
static void withBlock () {
int x = 1;
{
Object a = new Object();
}
System.out.println(x+1);
}
static void withoutBlock () {
int x = 1;
Object a = new Object();
System.out.println(x+1);
}

Not necessarily. An object becomes eligible for garbage collection when there are no live threads anymore that hold a reference to the object.
Local variables go out of scope when the method returns, and it makes no sense at all to set local variables to null: the variables disappear anyway, and if nothing else holds a reference to the objects that the variables referred to, then those objects become eligible for garbage collection.
The key is not to look at just variables, but look at the objects that those variables refer to, and find out where those objects are referenced by your program.

It is useless on local variables, but it can be useful/needed to clear up instance variables that are not required anymore (e.g. post-initialization).
(Yeah yeah, I know how to apply the Builder pattern...)

That could only make some sense in some scenario like this:
public void myHeavyMethod() {
List hugeList = loadHugeListOfStuff(); // lots of memory used
ResultX res = processHugeList(hugeList); // compute some result or summary
// hugeList = null; // we are done with hugeList
...
// do a lot of other things that takes a LOT of time (seconds?)
// and which do not require hugeList
...
}
Here there could be some benefit in uncommenting the hugeList = null line, I guess.
But it would certainly make more sense to rewrite the method (perhaps refactoring into two,
or specifying an inner scope).
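The "refactoring into two" option could look roughly like this. All names (HeavyWork, loadAndSummarize, otherLongWork) and the use of an int sum as the "result" are placeholders chosen for illustration; the original snippet leaves these details open.

```java
import java.util.ArrayList;
import java.util.List;

class HeavyWork {
    static int myHeavyMethod() {
        // The list lives only inside loadAndSummarize(); once that call
        // returns, no stack frame refers to it any more.
        int res = loadAndSummarize();
        otherLongWork();
        return res;
    }

    private static int loadAndSummarize() {
        List<Integer> hugeList = loadHugeListOfStuff();
        int sum = 0;
        for (int value : hugeList) {
            sum += value;
        }
        return sum;
    }

    // Placeholder for loading a large amount of data.
    static List<Integer> loadHugeListOfStuff() {
        List<Integer> list = new ArrayList<>();
        for (int i = 0; i < 1_000; i++) {
            list.add(i);
        }
        return list;
    }

    // Placeholder for the long-running work that does not need the list.
    static void otherLongWork() {
    }

    public static void main(String[] args) {
        System.out.println(myHeavyMethod()); // 499500
    }
}
```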

Setting an object reference to null only makes the object eligible for garbage collection.
It does not necessarily free up the memory, which depends on when the garbage collector runs (which in turn depends on the JVM).
When the garbage collector runs, it frees up the heap by deleting only the objects that are eligible for garbage collection.

It is good to have. When you set references to null, there is a possibility that the object will be garbage collected sooner, in the very next GC cycle. But there is no guaranteed mechanism to make an object get garbage collected at a given time.
