Can Java garbage collect variables before end of scope? - java

Suppose we have a program like this:
void main() {
// Point 0
BigThing bt = new BigThing();
// Point 1
WeakReference<BigThing> weak = new WeakReference<>(bt);
// Point 2
doSomething(weak);
// Point 3
}
void doSomething(...) { ... }
We know that the weak reference to the BigThing object cannot prevent the object from being garbage collected when it becomes no longer strongly reachable.
My question is about the local variable bt which is a strong reference to the BigThing object. Does the object become not-strongly-reachable at point 2 (just before calling doSomething()) or at point 3 (end of block scope)?
The answer to this question will affect whether the call to doSomething() is guaranteed to be able to access the live BigThing object, or whether the underlying object can die during the function call.
I am uncertain because you could argue that after point 2, the local variable bt is never read or written anymore, so the variable is effectively dead and the pointer value can be discarded. This "optimization" would be valid if all references were strong, but the reasoning falls apart when the notions of soft, weak, and phantom references are introduced, and finalizers as well. Also as an analogy, because C++ has destructors, a value must be destructed at the end of the scope, and cannot be moved ahead to the point of last usage.

I would say the object is collectable at point 2, going by the following language in JLS section 12.6.1:
Optimizing transformations of a program can be designed that reduce the number of objects that are reachable to be less than those which would naively be considered reachable. For example, a Java compiler or code generator may choose to set a variable or parameter that will no longer be used to null to cause the storage for such an object to be potentially reclaimable sooner.
Since the bt variable will no longer be used after point 2, Java is free to clear that variable, rendering the BigThing object only weakly reachable.

Java 9 introduces Reference.reachabilityFence to address exactly this case, which of course also implies that the problem exists in the first place.
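As a rough illustration of how such a fence might be used with the program from the question, here is a minimal sketch. BigThing and the body of doSomething() are hypothetical stand-ins (the question elides them), and the try/finally shape follows the pattern suggested in the Reference.reachabilityFence Javadoc.

```java
import java.lang.ref.Reference;
import java.lang.ref.WeakReference;

class FenceDemo {
    // Hypothetical stand-in for the question's BigThing.
    static final class BigThing {
        final byte[] payload = new byte[1024];
    }

    // Hypothetical stand-in for doSomething(): reports whether the referent is still available.
    static boolean doSomething(WeakReference<BigThing> weak) {
        System.gc(); // only a hint; the fence in run() is what keeps the object alive
        return weak.get() != null;
    }

    static boolean run() {
        BigThing bt = new BigThing();
        WeakReference<BigThing> weak = new WeakReference<>(bt);
        try {
            return doSomething(weak);
        } finally {
            // bt stays strongly reachable at least until this call.
            Reference.reachabilityFence(bt);
        }
    }

    public static void main(String[] args) {
        System.out.println("referent alive during call: " + run());
    }
}
```

Without the fence, nothing in the language requires the object to survive the call; with it, weak.get() inside doSomething() should not observe a cleared reference.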

Related

Weak Reference and strong Reference

I have a very basic question regarding weak reference and strong reference in Java.
In everyday Java programming we usually do not create weak references to objects. We create ordinary strong references, and when we are done with an object we assign null to the variable, on the understanding that the object will be collected by the GC the next time it runs.
Is my understanding wrong?
After reading some articles, it looks like an object is collected by the GC if nothing refers to it any more (for example because the variable was set to null), or if the only references left to it are weak. I am confused.
In other words, what is the difference between these two code snippets with respect to Java GC?
Snippet 1
Counter counter = new Counter(); // strong reference - line 1
WeakReference<Counter> weakCounter = new WeakReference<Counter> (counter); //weak reference
counter = null;
Snippet 2
Counter counter = new Counter(); // strong reference - line 1
counter = null;
In both cases, counter will be eligible for garbage collection. Even if you use SoftReference, it will be eligible for GC, but it will only be collected reluctantly. (That is, a SoftReference encourages the GC to leave the object in memory, but still allows it to be collected.)
Only hard references force the GC to leave objects alone.
Normally you only need to assign null to a reference if the reference has longer life than you want for the object. Once a hard-reference variable goes out of scope, it is no longer reachable from live code so its hard reference will not prevent the GC from collecting the object.
Note also that there's no guarantee as to when objects eligible for collection will actually be collected by the GC. It may be on the next GC cycle or maybe not. It depends heavily on the implementation of the GC. The only thing you can say for sure is that all eligible objects will be collected before the VM throws an OutOfMemoryError.
The difference, if I remember correctly, is this:
The first code snippet assigns null to the variable counter after the object has been handed to weakCounter. Therefore weakCounter still refers to the old Counter object even though counter no longer does. But the object can still be collected by the garbage collector, even though weakCounter refers to it.
In the second code example, the variable counter goes from referring to an object to null, letting Java know "hey, you can collect me in the garbage!"
Hope this made sense and helped your understanding. If some of my facts are wrong, please feel free to tell me where I am mistaken :)
The two are essentially equivalent, save for the fact that you might be able to reference the object through the WeakReference if you do so before GC collects it.
The purpose of the WeakReference is so you can have it stashed somewhere (eg, some sort of search index) and not worry about having to clear it if you are done with the object and wish to null any "strong" references (so that the object may be collected and the space reused). If you used an ordinary strong reference you'd have to be sure to clear it or the object would hang around forever.
(SoftReferences, as mentioned by Ted Hopp, are similar in mechanics, except that GC will only collect the referenced objects if storage is tight. This makes them suitable for things like cached internet pages.)
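To make the behaviour of Snippet 1 concrete, here is a small sketch. Counter is a hypothetical stand-in for the class in the question, and the method names are made up for the example. The first method shows the part that is guaranteed; the second shows the part that is not, since System.gc() is only a request.

```java
import java.lang.ref.WeakReference;

class WeakCounterDemo {
    // Hypothetical stand-in for the question's Counter class.
    static final class Counter {
        int value;
    }

    // While a strong reference is still in use, the weak reference yields the same object.
    static boolean sameObjectWhileStronglyHeld() {
        Counter counter = new Counter();
        WeakReference<Counter> weakCounter = new WeakReference<>(counter);
        Counter viaWeak = weakCounter.get();
        counter.value++; // counter is still used here, so the object was strongly reachable above
        return viaWeak == counter;
    }

    // After the last strong reference is dropped, get() may return the object or null.
    static String afterNulling() {
        Counter counter = new Counter();
        WeakReference<Counter> weakCounter = new WeakReference<>(counter);
        counter = null;   // last strong reference dropped
        System.gc();      // a request, not a command
        Counter maybe = weakCounter.get();
        return maybe == null ? "collected" : "still there";
    }

    public static void main(String[] args) {
        System.out.println(sameObjectWhileStronglyHeld());
        System.out.println(afterNulling()); // either answer is legal
    }
}
```

Code that reads a WeakReference therefore always has to handle a null result from get().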

How are local variables kept in memory?

I would like to know how a local variable is placed in memory. In method1, does the variable take up a place in memory only once?
In method2, does the variable get a new place in memory on each iteration, after the old place is deleted?
public void method1() {
Object obj = null;
for(.....) {
obj = come from other-->
}
}
public void method2() {
for(.....) {
Object obj = come from other-->
}
}
You have a local variable, which may live in a register or occupy a single slot in memory.
You also have an object which the local variable references. This will be created on each iteration in both cases.
They are effectively the same, except I would prefer the second case if it is possible as it narrows the scope of the local variable.
Each method call is associated with Activation Record that is stored on a call stack. The activation record holds references to the memory blocks in heap corresponding to the method level variables. Once the method call returns to the caller, this activation record will be removed from the stack and the memory references are potentially available to be garbage-collected.
In your case:
The obj in the first method: its reference is stored on the call stack and the actual object memory is on the heap; the reference slot is set up once per method call.
The obj in the for loop in the second method is created once for each iteration and goes out of scope at the end of each iteration. So the reference and the memory on the heap are allocated for each iteration.
Local variables are usually (unless, for example, optimized away) kept in stack memory. But they can only store primitive values or references. The referenced objects themselves are usually allocated on the heap (barring any JIT optimization).
See Stack based memory allocation (Wikipedia) vs. Heap based memory allocation (Wikipedia).
Storing values on the stack is very cheap. Similar to a function call, where you store the return pointer on the stack. It does not require much more than incrementing the stack pointer (and you can imagine that incrementing a dedicated CPU register is fast!)
The object itself is different. Note that theoretically, some java compiler or JIT might be able to optimize your second code better, because you indicate clearly that the value is not needed for the next iteration. (An even better compiler should be able to figure this out itself.)
In general, a modern compiler should produce the same machine code after optimization for both cases. (This may happen in the JIT compiler, so the Java bytecode may still show the difference).
Anyway: do not try to overoptimize by reusing local variables. Instead, write explicit code and let the compiler optimize. By using a fresh variable inside the loop, you make it explicit that it is not reused anywhere. This can prevent some programming errors!
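The two styles can be compared side by side in a runnable form. This is only a sketch: the question elides where the objects come from, so the list of strings and the method names here are assumptions made for the example.

```java
import java.util.List;

class LoopScopeDemo {
    // Variable declared outside the loop, as in method1.
    static int totalLengthOuter(List<String> items) {
        int total = 0;
        String s = null;
        for (int i = 0; i < items.size(); i++) {
            s = items.get(i);
            total += s.length();
        }
        return total;
    }

    // Variable declared inside the loop, as in method2.
    static int totalLengthInner(List<String> items) {
        int total = 0;
        for (int i = 0; i < items.size(); i++) {
            String s = items.get(i); // cannot be misused after the loop
            total += s.length();
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> items = List.of("a", "bb", "ccc");
        System.out.println(totalLengthOuter(items)); // 6
        System.out.println(totalLengthInner(items)); // 6
    }
}
```

Both versions compute the same result; the second simply keeps the variable from being visible where it is not needed.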
I believe in both cases a new Object is created in memory for every iteration. It is up to the garbage collector to notice that there are no references to any but the most 'recent' Object.
Objects in method1 and method2 will be placed on the heap, but the JIT compiler performs escape analysis to determine whether such an object can be released as soon as the method finishes. Escape analysis has been implemented since Java Standard Edition 6.

Why is an 'invisible' object not instantly collected?

I just read this article: The Truth About Garbage Collection
In section "A.3.3 Invisible" it is explained how and when an object gets into the invisible state.
In the code below, the object assigned to the variable foo will become invisible after leaving the try/catch block and will remain strongly referenced until the run method exits (which will never happen, because the while loop runs forever).
public void run() {
try {
Object foo = new Object();
foo.doSomething();
} catch (Exception e) {
// whatever
}
while (true) {
// do stuff
} // loop forever
}
It is stated in this article:
However, an efficient implementation of the JVM is unlikely to zero the reference when it goes out of scope.
Why is that not efficient?
My attempt at an explanation is as follows:
Say the stack for this method contains four elements, with the now invisible object being at the bottom.
If you want to collect the object instantly, you would have to pop and store three elements, pop and discard the fourth element and then push the three still valid elements back onto the stack.
If you collect the invisible object after control flow has left the run method, the VM could simply pop all four elements and discard them.
Local variables are not on the operand stack but in the local variable area of the activation frame. References there are accessed via the aload and astore bytecodes, and zeroing a local variable does not involve any pushing or popping.
Zeroing is inefficient because it is not needed:
it would not cause an immediate garbage collection cycle
the zero may soon be overwritten by another value as dictated by the logic of the program.
going out of the scope means that the local variable is no longer part of the root set for garbage collection. As such what value it held immediately before going out of scope - zero or a valid reference - is immaterial; it won't be examined anyway.
EDIT:
Some comments on the last statement.
Indeed, at the bytecode level there are no scopes, and a local variable slot may remain part of the root set until the method returns. Of course, a JVM implementation can determine when a local variable slot is dead (i.e. all possible paths to method return either don't access the variable or only store to it) and then not consider it part of the root set, but it is by no means required to do so.
The very simple answer is: because it is inefficient.
There are many garbage collector algorithms and some may aggressively collect. Some compilers do allocation on the stack but the most obvious in your case is: doSomething() may actually keep (leak) a reference to the object elsewhere.

Are objects cleaned up when references to them are nulled?

public class App1
{
public static void main(String[] args)
{
Point point_1 = new Point(5,5);
Point point_2 = new Point(7,8);
Circle circle_1 = new Circle(point_2, 10);
point_1 = null;
point_2 = null;
}
}
How many object references exist after this code has executed? Why?
After this code has executed, exactly none, since it will have exited :-)
If you mean at the point just before exit, there's a reference on the stack to your circle and a reference in your circle to the second point, assuming the constructor stores it.
Despite formulation problems, the snippet is actually quite instructive on certain aspects of garbage collectibility. Let's take a look at it line-by-line.
Point point_1 = new Point(5,5);
So we've declared a reference variable point_1, and it points to a new Point. Let's assume for now that the constructor of Point doesn't do anything fancy and simply sets the fields final int x, y to the given values.
Thus, we now have something like this:
point_1 -> [aPoint(5 5)]
Now let's take a look at the next line:
Point point_2 = new Point(7,8);
Now we have something like this:
point_1 -> [aPoint(5 5)]
point_2 -> [aPoint(7 8)]
Now let's take a look at the next line:
Circle circle_1 = new Circle(point_2, 10);
Here again we don't quite know how Circle is implemented, but it's reasonable to assume that it has a final Point center and final int radius fields, and with the Point center specifically, it simply sets the reference to the given Point (i.e. no defensive copying since Point is immutable).
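So now we may have something like this:
point_1 -> [aPoint(5 5)]
point_2 -> [aPoint(7 8)]
circle_1 -> [aCircle(10)], with [aCircle(10)].center -> [aPoint(7 8)]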
Then with the next two statements, we set point_1 and point_2 to point to null respectively:
point_1 = null;
point_2 = null;
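So now we have something like this:
point_1 -> null
point_2 -> null
circle_1 -> [aCircle(10)], with [aCircle(10)].center -> [aPoint(7 8)]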
We can now observe that:
The object [aPoint(5 5)] is no longer reachable
The object [aPoint(7 8)], though no longer referred to by point_2, is still referred to by [aCircle(10)].center.
Garbage collectibility is defined by whether or not an object is reachable through a live reference. The object [aPoint(5 5)], we can strongly assume (based on how we think Point is implemented), is no longer reachable, so it is eligible for collection (it's garbage! No one can "pick it up" now!).
On the other hand, the object [aPoint(7 8)] is still referred to by [aCircle(10)].center, so we can say that it's NOT eligible for collection (it's not garbage! Someone is still "hanging on" to it!).
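The walkthrough can be checked with a small program. The Point and Circle classes below are assumptions that match the guesses made above (final fields, no defensive copy); the real classes from the question are not shown anywhere.

```java
class ReachabilityDemo {
    // Assumed shape of Point: immutable, nothing fancy in the constructor.
    static final class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    // Assumed shape of Circle: stores the given Point reference directly.
    static final class Circle {
        final Point center;
        final int radius;
        Circle(Point center, int radius) { this.center = center; this.radius = radius; }
    }

    static Circle build() {
        Point point_1 = new Point(5, 5);
        Point point_2 = new Point(7, 8);
        Circle circle_1 = new Circle(point_2, 10);
        point_1 = null; // [aPoint(5 5)] now has no references left
        point_2 = null; // [aPoint(7 8)] is still held by circle_1.center
        return circle_1;
    }

    public static void main(String[] args) {
        Circle c = build();
        System.out.println(c.center.x + "," + c.center.y); // 7,8
    }
}
```

Nulling point_2 changed nothing for the second Point: it is still reachable through the circle for as long as the circle itself is reachable.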
Moral
So no, setting a reference to null definitely does NOT automatically make the object it previously referred to eligible for collection. It depends on whether any other references to that object remain.
Certainly, though, setting a reference to null CAN help make an object be eligible for collection, e.g. when that reference is the last remaining to the object.
You do NOT, however, ALWAYS have to set a reference to null to make garbage collection "work". When a variable goes out of scope, the reference is no longer alive, so in those kinds of cases explicitly setting it to null is simply redundant code.
The classic example when explicitly setting to null DOES work is the Stack example: when the top element is popped from the Stack, the Stack should no longer refer to the object from its internal data structure.
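A minimal sketch in the spirit of that stack example follows. It is not the code from any particular book or library; the class name SimpleStack and the helper slotIsCleared are made up for illustration.

```java
import java.util.Arrays;
import java.util.EmptyStackException;

class SimpleStack<T> {
    private Object[] elements = new Object[4];
    private int size = 0;

    void push(T item) {
        if (size == elements.length) {
            elements = Arrays.copyOf(elements, 2 * size); // grow when full
        }
        elements[size++] = item;
    }

    @SuppressWarnings("unchecked")
    T pop() {
        if (size == 0) {
            throw new EmptyStackException();
        }
        T result = (T) elements[--size];
        elements[size] = null; // drop the obsolete reference so the array no longer pins the object
        return result;
    }

    int size() { return size; }

    // Exposed only so the example can show that the slot was cleared.
    boolean slotIsCleared(int index) { return elements[index] == null; }

    public static void main(String[] args) {
        SimpleStack<String> stack = new SimpleStack<>();
        stack.push("a");
        stack.push("b");
        System.out.println(stack.pop());            // b
        System.out.println(stack.slotIsCleared(1)); // true
    }
}
```

Without the assignment to null in pop(), the backing array would keep a reference to every popped element until that slot happened to be overwritten by a later push.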
See also
Effective Java 2nd Edition, Item 6: Eliminate obsolete object references
Related questions
Does variable = null set it for garbage collection
The answer is:
Define what you mean for an object reference to "exist".
It is impossible to know how many object references were even created, without details of the Point and Circle classes.
The answer is irrelevant, because after the main method exits none of the objects will be reachable ... whether or not the references still "exist".
We might infer that at the point in time immediately before the main method returns there will be one reachable reference to a Circle object and one reachable reference to a Point. But one has to make some (reasonable) assumptions about how those two classes are implemented to make that inference. (For example, one has to assume that the respective constructors don't add the Point and Circle reference to some static data structure.)
Are objects cleaned up when references to them are nulled?
No. Objects are cleaned up when the garbage collector runs, and it determines that the objects in question are no longer reachable. In this sense, "reachable" means that you can get to the object by following a chain of references to the object starting from:
a static attribute of some class
a local variable of some method that is currently being executed by some thread
an attribute of some other reachable object, or
an element of some other reachable array.
(I've simplified the explanations of GC and reachability a bit to avoid confusing the OP with things he/she won't understand yet.)

Does variable = null set it for garbage collection

Help me settle a dispute with a coworker:
Does setting a variable or collection to null in Java aid in garbage collection and reducing memory usage? If I have a long running program and each function may be iteratively called (potentially thousands of times): Does setting all the variables in it to null before returning a value to the parent function help reduce heap size/memory usage?
That's old performance lore. It was true back in 1.0 days, but the compiler and the JVM have been improved to eliminate the need (if ever there was one). This excellent IBM article gets into the details if you're interested: Java theory and practice: Garbage collection and performance
From the article:
There is one case where the use of explicit nulling is not only helpful, but virtually required, and that is where a reference to an object is scoped more broadly than it is used or considered valid by the program's specification. This includes cases such as using a static or instance field to store a reference to a temporary buffer, rather than a local variable, or using an array to store references that may remain reachable by the runtime but not by the implied semantics of the program.
Translation: "explicitly null" persistent objects that are no longer needed. (If you want. "Virtually required" too strong a statement?)
The Java Language Specification
12.6.1 Implementing Finalization
Every object can be characterized by two attributes: it may be reachable, finalizer-reachable, or unreachable, and it may also be unfinalized, finalizable, or finalized.
A reachable object is any object that can be accessed in any potential continuing computation from any live thread. Optimizing transformations of a program can be designed that reduce the number of objects that are reachable to be less than those which would naively be considered reachable. For example, a compiler or code generator may choose to set a variable or parameter that will no longer be used to null to cause the storage for such an object to be potentially reclaimable sooner.
Discussion
Another example of this occurs if the values in an object's fields are stored in registers. The program may then access the registers instead of the object, and never access the object again. This would imply that the object is garbage.
The object is reachable if it can be involved in any potential continuing computation. So if your code refers to a local variable, and nothing else refers to it, then you might cause the object to be collected by setting it to null. This would either give a null pointer exception, or change the behaviour of your program, or if it does neither you didn't need the variable in the first place.
If you are nulling out a field or an array element, then that can possibly make sense for some applications, and it will cause the memory to be reclaimed faster. One case is creating a large array to replace an existing array referenced by a field in a class: if the field is nulled before the replacement is created, then it may relieve pressure on the memory.
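The array replacement case might look like the following sketch. BufferHolder and its methods are hypothetical names chosen for the example; whether the early nulling helps in practice depends on the heap size and the collector.

```java
class BufferHolder {
    private byte[] buffer;

    BufferHolder(int size) {
        buffer = new byte[size];
    }

    // Replace the buffer without keeping old and new arrays reachable at the same time.
    void replaceBuffer(int newSize) {
        buffer = null;              // old array becomes unreachable if nothing else holds it
        buffer = new byte[newSize]; // the collector may reclaim the old array to satisfy this
    }

    int capacity() { return buffer.length; }

    public static void main(String[] args) {
        BufferHolder holder = new BufferHolder(1024);
        holder.replaceBuffer(2048);
        System.out.println(holder.capacity()); // 2048
    }
}
```

One trade-off of this sketch: if the new allocation fails, the field is left null, so real code would need to decide how to handle that.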
Another interesting feature of Java is that scope doesn't appear in class files, so scope is not relevant to reachability; these two methods create the same bytecode, and hence the VM does not see the scope of the created object at all:
static void withBlock () {
int x = 1;
{
Object a = new Object();
}
System.out.println(x+1);
}
static void withoutBlock () {
int x = 1;
Object a = new Object();
System.out.println(x+1);
}
Not necessarily. An object becomes eligible for garbage collection when it can no longer be reached from any live thread.
Local variables go out of scope when the method returns, and it makes no sense at all to set local variables to null: the variables disappear anyway, and if nothing else holds a reference to the objects that the variables referred to, then those objects become eligible for garbage collection.
The key is not to look at just variables, but look at the objects that those variables refer to, and find out where those objects are referenced by your program.
It is useless on local variables, but it can be useful/needed to clear up instance variables that are not required anymore (e.g. post-initialization).
(Yeah yeah, I know how to apply the Builder pattern...)
That could only make some sense in some scenario like this:
public void myHeavyMethod() {
List hugeList = loadHugeListOfStuff(); // lots of memory used
ResultX res = processHugeList(hugeList); // compute some result or summary
// hugeList = null; // we are done with hugeList
...
// do a lot of other things that takes a LOT of time (seconds?)
// and which do not require hugeList
...
}
Here there could be some benefit in uncommenting the hugeList = null line, I guess.
But it would certainly make more sense to rewrite the method (perhaps refactoring it into two, or introducing an inner scope).
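The refactoring into two methods could be sketched as follows. The bodies of loadHugeListOfStuff() and processHugeList() are invented stand-ins (the original only names them), and a long sum replaces the unspecified ResultX type.

```java
import java.util.ArrayList;
import java.util.List;

class HeavyMethodDemo {
    // Hypothetical stand-in for loadHugeListOfStuff().
    static List<Integer> loadHugeListOfStuff() {
        List<Integer> list = new ArrayList<>();
        for (int i = 1; i <= 1000; i++) {
            list.add(i);
        }
        return list;
    }

    // Hypothetical stand-in for processHugeList(): reduces the list to a small summary.
    static long processHugeList(List<Integer> hugeList) {
        long sum = 0;
        for (int v : hugeList) {
            sum += v;
        }
        return sum;
    }

    // The huge list lives only in this frame; once it returns, no local variable refers to it.
    static long loadAndSummarize() {
        List<Integer> hugeList = loadHugeListOfStuff();
        return processHugeList(hugeList);
    }

    static long myHeavyMethod() {
        long res = loadAndSummarize();
        // ... long-running work that needs only res would go here ...
        return res;
    }

    public static void main(String[] args) {
        System.out.println(myHeavyMethod()); // 500500
    }
}
```

With this shape there is no variable left to null out: the list is unreachable from myHeavyMethod() as soon as the helper returns.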
Setting an object reference to null only makes the object eligible for garbage collection, and only if no other references to it remain.
It does not necessarily free the memory; that depends on when the garbage collector runs, which in turn depends on the JVM.
When the garbage collector runs, it frees heap space by deleting only the objects that are eligible for garbage collection.
It is nice to have. When you set references to null, there is a possibility that the object can be garbage collected sooner, in the next GC cycle. But there is no guaranteed mechanism to make an object garbage collected at a given time.
