Why is an 'invisible' object not instantly collected?

Why is an 'invisible' object not instantly collected? - java

I just read this article: The Truth About Garbage Collection
In section "A.3.3 Invisible" it is explained how and when an object gets into the invisible state.
In the below code, the object assigned to the variable foo will become invisible after leaving the try/catch block and will remainly strongly referenced until the run method exits (which will never happen, because the while loop runs forever).
public void run() {
try {
Object foo = new Object();
foo.doSomething();
} catch (Exception e) {
// whatever
}
while (true) { // do stuff } // loop forever
}
It is stated in this article:
However, an efficient implementation of the JVM is unlikely to zero
the reference when it goes out of scope.
Why is that not efficient?
My attempt at an explanation is as follows:
Say the stack for this method contains four elements, with the now invisible object being at the bottom.
If you want to collect the object instantly, you would have to pop and store three elements, pop and discard the fourth element and then push the three still valid elements back onto the stack.
If you collect the invisible object after control flow has left the run method, the VM could simply pop all four elements and discard them.

The local variables are not on the operand stack, but in the local variables area in the activation frame, accessed, in the case of references via aload and astore bytecodes and zeroing a local variable does not involve any pushing and popping.
Zeroing is inefficient because it is not needed:
it would not cause an immediate garbage collection cycle
the zero may soon be overwritten by another value as dictated by the logic of the program.
going out of the scope means that the local variable is no longer part of the root set for garbage collection. As such what value it held immediately before going out of scope - zero or a valid reference - is immaterial; it won't be examined anyway.
EDIT:
Some comments on the last statement.
Indeed, at a bytecode level there are no scopes and a local variable slot may remain a part of the root set until the method returns. Of course, a JVM implementation can determine when a local variable slot is dead (i.e. all possible paths to method return either don't access the variable or are stores) and don't consider it a part of the root set, but it is by no means required to do so.

The very simple answer is b/c is inefficient.
There are many garbage collector algorithms and some may aggressively collect. Some compilers do allocation on the stack but the most obvious in your case is: doSomething() may actually keep (leak) a reference to the object elsewhere.

Related

How can I make an Object live in memory only for some part of a function in JAVA?

I have a function which runs for a long time and I declare and assign an object inside that function. Now from what I think I know this object will live in memory atleast for as long as this function is running and after that it will be available for garbage collection if no other object references it. Now, I want this object to be available for garbage collection before even function is completed running. In the coding mannger:
public void foo(){
String title;
Object goo=getObject(); //getObject is some other function
//which returns Object and can be null and I want Object to be in memory from here
if(goo!=null)title=goo.getString("title");
//after the last line of code I want Object to be available for garbage
// collection because below is a long running task which doesn't require Object.
for(int i=0;i <1000000;i++) //long running task
{
//some mischief inside with title
}
}
Now from the above task what I want Object to be available for garbage collection before the for loop starts. Can I do something like enclose two lines of code before for loop to be inside curly braces, If no what can I do to achieve this task ?

The object will not stay around during the loop even if you do not do anything. Java will figure out that the object is not reachable once your code passes the point of last access.
A reachable object is any object that can be accessed in any potential continuing computation from any live thread. JVMs are pretty smart these days at figuring out reachability. If it sees that there is no code continuation that could possibly access goo during or after the loop, Java makes the object eligible for garbage collection after the point of last access. Even if the last access checker is very conservative, having no accesses to goo is sufficient to convince JVM that the object is no longer reachable.
See this Q&A for more details.

After you have finished with the object, assigning null to goo will make it no longer reachable via goo. If there is nothing else keeping it reachable, it will then be eligible for garbage collection.
Arranging your code so that the goo variable goes out of scope might have the same effect, but that approach depends on the JIT compiler / GC's ablity to optimize away the tracing of out-of-scope variables. That is platform dependent.

You could try assigning
goo = null;
after you get the information you need from it and gc may collect the object if nothing else references it.
For some other ideas see Marking an Object To Be Removed In The GC

You can put the variable, function call and if test in a block. Like,
{ // <-- add this
Object goo = getObject();
if (goo != null) { // <-- just because you can omit braces, doesn't mean
// you should.
title = goo.getString("title");
}
} // <-- and this, and then goo is out of scope.

Eligibility of garbage collection

This code is a part of my class test -
class Bar { }
class Test
{
Bar doBar()
{
Bar b = new Bar(); /* Line 6 */
return b; /* Line 7 */
}
public static void main (String args[])
{
Test t = new Test(); /* Line 11 */
Bar newBar = t.doBar(); /* Line 12 */
System.out.println("newBar");
newBar = new Bar(); /* Line 14 */
System.out.println("finishing"); /* Line 15 */
}
}
At what point is the Bar object, created on line 6, eligible for garbage collection? Is it when doBar() completes?

All references to the Bar object created on line 6 are destroyed when a new reference to a new Bar object is assigned to the variable newBar on line 14. Therefore the Bar object, created on line 6, is eligible for garbage collection after line 14.
It isn't when doBar() completes because the reference in the doBar() method is returned on line 7 and is stored in newBar on line 12. This preserver the object created on line 6.

Questions of this kind seems to be very popular when teaching Java to beginners, but are complete nonsense. While an answer like this is great for making, whoever invented this question, happy, it’s not correct when digging deeper:
References to objects held in local variables do not prevent objects from being garbage collected. If the subsequent code does not touch them, these object may still be considered unreachable, which is not a theoretical issue. As demonstrated in “finalize() called on strongly reachable object in Java 8”, even the ongoing execution of a method of that object does not hinder its collection, if the instance is not subsequently touched.
Since in your example code, the objects do not hold any data, it’s obvious that they are never touched at all, which implies that they might get collected at any time, depending on the optimization state of the JVM. The code might also get optimized to a degree that it only performs the visible side effect of the two print statements, in other words, that the objects are never created at all.
This is backed by The Java® Language Specification, § 12.6.1. Implementing Finalization:
Optimizing transformations of a program can be designed that reduce the number of objects that are reachable to be less than those which would naively be considered reachable. For example, a Java compiler or code generator may choose to set a variable or parameter that will no longer be used to null to cause the storage for such an object to be potentially reclaimable sooner.
Another example of this occurs if the values in an object's fields are stored in registers. The program may then access the registers instead of the object, and never access the object again. This would imply that the object is garbage. …
Practically, since this is trivial short-running code inside a main method, the most common scenario would be that the code runs unoptimized, but no garbage collection will ever happen in that short run time.
This leads to the other reason why asking such questions is nonsense. It might be challenging to find the right points when an object is naively considered unreachable, it’ll be impossible to guess when and how the optimizer influences the outcome, but the entire purpose of the garbage collection is that the developer doesn’t have to worry about that.

After assignment at lin 14 is executed. At that point there is no references to Bar object from line 6 exists anymore.

I'd agree with jiltedpotato (It would be after line 14). An object is eligible for garbage collection, when all references to it are lost or discarded.
Take a look at: When Is The Object Eligible For Garbage Collection?

As simple as when there is no reference to an object will be eligible for garbage collection but when will it exactly be collected is dependent on GC algo.

Can Java garbage collect variables before end of scope?

Suppose we have a program like this:
void main() {
// Point 0
BigThing bt = new BigThing();
// Point 1
WeakReference<BigThing> weak = new WeakReference<>(bt);
// Point 2
doSomething(weak);
// Point 3
}
void doSomething(...) { ... }
We know that the weak reference to the BigThing object cannot prevent the object from being garbage collected when it becomes no longer strongly reachable.
My question is about the local variable bt which is a strong reference to the BigThing object. Does the object become not-strongly-reachable at point 2 (just before calling doSomething()) or at point 3 (end of block scope)?
The answer to this question will affect whether the call to doSomething() is guaranteed to be able to access the live BigThing object, or whether the underlying object can die during the function call.
I am uncertain because you could argue that after point 2, the local variable bt is never read or written anymore, so the variable is effectively dead and the pointer value can be discarded. This "optimization" would be valid if all references were strong, but the reasoning falls apart when the notions of soft, weak, and phantom references are introduced, and finalizers as well. Also as an analogy, because C++ has destructors, a value must be destructed at the end of the scope, and cannot be moved ahead to the point of last usage.

I would say the object is collectable at point 2, going by the following language in JLS section 12.6.1:
Optimizing transformations of a program can be designed that reduce the number of objects that are reachable to be less than those which would naively be considered reachable. For example, a Java compiler or code generator may choose to set a variable or parameter that will no longer be used to null to cause the storage for such an object to be potentially reclaimable sooner.
Since the bt variable will no longer be used after point 2, Java is free to clear that variable, rendering the BigThing object only weakly reachable.

Java 9 introduces Reference.reachabilityFence to solve this case, which of course also implies that it does exist in the first place.

How are local variable kept in Memory?

I would like to know how to place a local variable in memory? In method1, do the variable take a place into memory, one time?
In method2, do the variable take a place after deleting old place in memory, for each time?
public void method1() {
Object obj = null;
for(.....) {
obj = come from other-->
}
}
public void method2() {
for(.....) {
Object obj = come from other-->
}
}

You have a local variables which may be in a register or once in memory.
You also have an object which the local variable references. This will be created on each iteration in both cases.
They are effectively the same, except I would prefer the second case if it is possible as it narrows the scope of the local variable.

Each method call is associated with Activation Record that is stored on a call stack. The activation record holds references to the memory blocks in heap corresponding to the method level variables. Once the method call returns to the caller, this activation record will be removed from the stack and the memory references are potentially available to be garbage-collected.
In your case,
the obj in the first method, it's reference is stored in the call stack and the actual memory is on the heap and this is done once per method call.
the obj in the for loop in the second method is created once for each iteration and goes out of scope at the end of each iteration. So, the reference and the memory on the heap are allocated for each iteration.

The local variables are usually (unless e.g. optimized away) kept on the stack memory. But they can only store primitive values or references. The referenced objects themselves are usually allocated on the heap (withstanding any JIT optimization).
See Stack based memory allocation (Wikipedia) vs. Heap based memory allocation (Wikipedia).
Storing values on the stack is very cheap. Similar to a function call, where you store the return pointer on the stack. It does not require much more than incrementing the stack pointer (and you can imagine that incrementing a dedicated CPU register is fast!)
The object itself is different. Note that theoretically, some java compiler or JIT might be able to optimize your second code better, because you indicate clearly that the value is not needed for the next iteration. (An even better compiler should be able to figure this out itself.)
In general, a modern compiler should produce the same machine code after optimization for both cases. (This may happen in the JIT compiler, so the Java bytecode may still show the difference).
Anyway: do not try to overoptimize by reusing local variables. Instead, write explicit code and let the compiler optimize. By using a fresh variable inside the loop, you make it explicit that it is not reused anywhere. This can prevent some programming errors!

I believe in both cases a new Object is created in memory for every iteration. It is up to the garbage collector to notice that there are no references to any but the most 'recent' Object.

Objects in method1 and method2 will be placed in heap, but java compiler perform Escape analysis for determination we need release this kind of object after method execution or not. Escape analysis is implemented in Java Standard Edition 6

Does variable = null set it for garbage collection

Help me settle a dispute with a coworker:
Does setting a variable or collection to null in Java aid in garbage collection and reducing memory usage? If I have a long running program and each function may be iteratively called (potentially thousands of times): Does setting all the variables in it to null before returning a value to the parent function help reduce heap size/memory usage?

That's old performance lore. It was true back in 1.0 days, but the compiler and the JVM have been improved to eliminate the need (if ever there was one). This excellent IBM article gets into the details if you're interested: Java theory and practice: Garbage collection and performance

From the article:
There is one case where the use of explicit nulling is not only helpful, but virtually required, and that is where a reference to an object is scoped more broadly than it is used or considered valid by the program's specification. This includes cases such as using a static or instance field to store a reference to a temporary buffer, rather than a local variable, or using an array to store references that may remain reachable by the runtime but not by the implied semantics of the program.
Translation: "explicitly null" persistent objects that are no longer needed. (If you want. "Virtually required" too strong a statement?)

The Java VM Spec
12.6.1 Implementing Finalization
Every object can be characterized by two attributes: it may be reachable, finalizer-reachable, or unreachable, and it may also be unfinalized, finalizable, or finalized.
A reachable object is any object that can be accessed in any potential continuing computation from any live thread. Optimizing transformations of a program can be designed that reduce the number of objects that are reachable to be less than those which would naively be considered reachable. For example, a compiler or code generator may choose to set a variable or parameter that will no longer be used to null to cause the storage for such an object to be potentially reclaimable sooner.
Discussion
Another example of this occurs if the values in an object's fields are stored in registers. The program may then access the registers instead of the object, and never access the object again. This would imply that the object is garbage.
The object is reachable if it can be involved in any potential continuing computation. So if your code refers to a local variable, and nothing else refers to it, then you might cause the object to be collected by setting it to null. This would either give a null pointer exception, or change the behaviour of your program, or if it does neither you didn't need the variable in the first place.
If you are nulling out a field or an array element, then that can possibly make sense for some applications, and it will cause the memory to be reclaimed faster. Once case is creating a large array to replace an existing array referenced by a field in a class - if the field in nulled before the replacement is created, then it may relieve pressure on the memory.
Another interesting feature of Java is that scope doesn't appear in class files, so scope is not relevant to reachability; these two methods create the same bytecode, and hence the VM does not see the scope of the created object at all:
static void withBlock () {
int x = 1;
{
Object a = new Object();
}
System.out.println(x+1);
}
static void withoutBlock () {
int x = 1;
Object a = new Object();
System.out.println(x+1);
}

Not necessarily. An object becomes eligible for garbage collection when there are no live threads anymore that hold a reference to the object.
Local variables go out of scope when the method returns and it makes no sense at all to set local variables to null - the variables disappear anyway, and if there's nothing else that holds a reference the objects that the variables referred to, then those objects become eligible for garbage collection.
The key is not to look at just variables, but look at the objects that those variables refer to, and find out where those objects are referenced by your program.

It is useless on local variables, but it can be useful/needed to clear up instance variables that are not required anymore (e.g. post-initialization).
(Yeah yeah, I know how to apply the Builder pattern...)

That could only make some sense in some scenario like this:
public void myHeavyMethod() {
List hugeList = loadHugeListOfStuff(); // lots of memory used
ResultX res = processHugeList(hugeList); // compute some result or summary
// hugeList = null; // we are done with hugeList
...
// do a lot of other things that takes a LOT of time (seconds?)
// and which do not require hugeList
...
}
Here it could make some benefit to uncomment the hugeList = null line, I guess.
But it would certainly make more sense to rewrite the method (perhaps refactoring into two,
or specifying an inner scope).

Setting an object reference to null only makes it eligible for garbage collection.
It does not necessarily free up the memory,which depends on when the garbage collector runs(which depends on JVM).
When the garbage collector runs,it frees up the heap by deleting only the objects which are eligible for garbage collection.

It is a good to have. When you set objects to null, there is a possibility that the object can be garbage collected faster, in the immediate GC cycle. But there is no guaranteed mechanism to make an object garbage collected at a given time.

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.