Good afternoon all,
I was taught that when a function returns, the variables (within the scope of that function) automatically go out of scope, so we do not have to set them to null.
However, this doesn't seem to be true.
I have test code that creates a java.lang.ref.PhantomReference pointing to an instance of java.lang.Object. The only strong reference to that object is within the scope of a function F.
In other words, when that function returns, there should no longer be any strong reference to that object, and the object should now be collectible by the GC.
However, no matter how hard I try to starve the JVM of memory, the GC simply refuses to collect the object. What is surprising is that if I set the variable to null (obj = null;), the GC now collects the object.
What is the explanation behind this oddity?
public class Test {
public static void main(String args[]) {
// currently testing on a 64-bit HotSpot Server VM, but the other JVMs should probably have the same behavior for this use case
Test test = new Test();
test.F(new Object());
}
public <T> void F(T obj) {
java.lang.ref.ReferenceQueue<T> ref_queue = new java.lang.ref.ReferenceQueue<T>();
java.lang.ref.PhantomReference<T> ref = new java.lang.ref.PhantomReference<T>(obj, ref_queue); // if this line isn't an assignment, the GC wouldn't collect the object no matter how hard I force it to
obj = null; // if this line is removed, the GC wouldn't collect the object no matter how hard I force it to
StartPollingRef(ref_queue);
GoOom();
}
private <T> void StartPollingRef(final java.lang.ref.ReferenceQueue<T> ref_queue) {
new java.lang.Thread(new java.lang.Runnable() {
@Override
public void run() {
System.out.println("Removing..");
boolean removed = false;
while (!removed) {
try {
ref_queue.remove();
removed = true;
System.out.println("Removed.");
} catch (InterruptedException e) { // ignore
}
}
}
}).start();
}
private void GoOom() {
try {
int len = (int) java.lang.Math.min(java.lang.Integer.MAX_VALUE, Runtime.getRuntime().maxMemory());
Object[] arr = new Object[len];
} catch (Throwable e) {
// System.out.println(e);
}
}
}
A standards-compliant JVM is never obligated to collect memory. That is to say, you cannot write a program whose correctness depends on a particular bit of memory being collected at a certain time: you can neither force the JVM to collect (even via System.gc()!) nor rely on it doing so.
So, the behavior you're observing cannot, definitionally, be wrong: you're purposefully trying to make the environment do something it is under no onus to do.
That all said, your issue is that your object has not gone out of scope. It is created in main, then passed - in the normal Java referential manner - to F. Until F returns, the T obj name is still a reference to your object.
Make GoOom static and put a call to it in main, after F has returned, and you should see the object get collected. But, then again, you might still not, and that wouldn't be wrong...
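A minimal sketch of a probe that treats collection as best effort, assuming Java 9+ for Reference.reachabilityFence. The names CollectionProbe, awaitEnqueue and trackFreshObject are made up for illustration. The helper allocates the object itself, so no caller frame ever holds a strong reference to it, and the wait is bounded by a timeout because the JVM is free never to collect at all:

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;

class CollectionProbe {
    /**
     * Hints at a GC repeatedly and waits up to timeoutMillis for any
     * reference to show up on the queue. Returns false on timeout or interrupt.
     */
    static boolean awaitEnqueue(ReferenceQueue<?> queue, long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        try {
            while (System.nanoTime() - deadline < 0) {
                System.gc(); // only a hint; the JVM may ignore it
                if (queue.remove(50) != null) {
                    return true;
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
        }
        return false;
    }

    /** Allocates the referent here, so the caller never holds a strong reference to it. */
    static PhantomReference<Object> trackFreshObject(ReferenceQueue<Object> queue) {
        return new PhantomReference<>(new Object(), queue);
    }

    public static void main(String[] args) {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        PhantomReference<Object> ref = trackFreshObject(queue);
        // Either answer is legal; the result depends on the JVM and its settings.
        System.out.println("enqueued within timeout: " + awaitEnqueue(queue, 2000));
        Reference.reachabilityFence(ref); // the reference object itself must stay reachable to be enqueued
    }
}
```

The only outcome this sketch can promise is the negative one: while something still holds and later uses a strong reference to the referent, awaitEnqueue returns false.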
When I run the following program
public static void main(String[] args) {
ArrayList<Object> lists = new ArrayList<>();
for (int i = 0; i <200000 ; i++) {
lists.add(new Object());
}
System.gc();
try {
Thread.sleep(Integer.MAX_VALUE);
} catch (InterruptedException e) {
e.printStackTrace();
}
}
and I dump the heap
jmap -dump:live,format=b,file=heap.bin 27648
jhat -J-Xmx2G heap.bin
The ArrayList and the 200000 objects are missing from the dump.
I don't understand how the JVM knows that the objects will not be used again,
or why it decides that this local variable does not count as a GC root.
Local variables are not GC roots per se. The Java® Language Specification defines:
A reachable object is any object that can be accessed in any potential continuing computation from any live thread.
It’s obvious that it requires a variable holding a reference to an object, to make it possible to access it in a “potential continuing computation” from a live thread, so the absence of such variables can be used as an easy-to-check sign that an object is unreachable.
But this doesn’t preclude additional effort to identify unreachable objects which are still referenced by local variables. The specification even states explicitly
Optimizing transformations of a program can be designed that reduce the number of objects that are reachable to be less than those which would naively be considered reachable. For example, a Java compiler or code generator may choose to set a variable or parameter that will no longer be used to null to cause the storage for such an object to be potentially reclaimable sooner.
in the same section.
Whether it actually does, depends on conditions like the current execution mode, i.e. whether the method runs interpreted or has been compiled already.
Starting with Java 9, you can insert explicit barriers, e.g.
public static void main(String[] args) {
ArrayList<Object> list = new ArrayList<>();
for (int i = 0; i <200000 ; i++) {
list.add(new Object());
}
System.gc();
try {
Thread.sleep(Integer.MAX_VALUE);
} catch (InterruptedException e) {
e.printStackTrace();
}
Reference.reachabilityFence(list);
}
This will force the list to stay alive.
An alternative for earlier Java versions is synchronization:
public static void main(String[] args) {
ArrayList<Object> list = new ArrayList<>();
for (int i = 0; i <200000 ; i++) {
list.add(new Object());
}
System.gc();
synchronized(list) {
try {
Thread.sleep(Integer.MAX_VALUE);
} catch (InterruptedException e) {
e.printStackTrace();
}
}
}
But usually, you want unused objects to become collected as early as possible. Problems may only arise when you use finalize() together with naive assumptions about the reachability.
As you can see, the first block executes SAVE_HOOK = null; before reaching the if branch, so I expected SAVE_HOOK == null and that it would not reach SAVE_HOOK.isAlive();. But it does when I test it in Eclipse:
Eclipse IDE for Java Developers
Version: Neon.3 Release (4.6.3)
Build id: 20170314-1500
Why does this happen?
public class FinalizeEscapeGC {
public static FinalizeEscapeGC SAVE_HOOK = null;
public void isAlive() {
System.out.println("yes, i am still alive :)");
}
@Override
protected void finalize() throws Throwable {
super.finalize();
System.out.println("finalize mehtod executed!");
FinalizeEscapeGC.SAVE_HOOK = this;
}
public static void main(String[] args) throws Throwable {
SAVE_HOOK = new FinalizeEscapeGC();
// ---------------block 1----------------//
SAVE_HOOK = null;
System.gc();
Thread.sleep(500);
if (SAVE_HOOK != null) { // "SAVE_HOOK = null;"
SAVE_HOOK.isAlive(); // why can it go inside this if branch the first time?
} else {
System.out.println("no, i am dead :(");
}
// ---------------block 1----------------//
// the same as the above block
// ---------------block 2----------------//
SAVE_HOOK = null;
System.gc();
Thread.sleep(500);
if (SAVE_HOOK != null) {
SAVE_HOOK.isAlive();
} else {
System.out.println("no, i am dead :(");
}
// ---------------block 2----------------//
}
}
Result:
finalize mehtod executed!
yes, i am still alive :)
no, i am dead :(
It sounds like you're trying to ask two questions:
Why is SAVE_HOOK not null after the first block?
Why is it null after the second, identical block?
The simple answer to the first question is: Because System.gc() triggered the garbage collector, which in turn ran the finalizer, which explicitly reassigns the reference. But I assume you know that already, and what you mean to ask is: Why wasn't the object reclaimed after the finalizer completed?
The answer is in the documentation to finalize():
After the finalize method has been invoked for an object, no further action is taken until the Java virtual machine has again determined that there is no longer any means by which this object can be accessed [...] at which point the object may be discarded.
So as long as you still have a strong reference to the object, it will never be garbage collected.
The answer to the second question is: You set it to null and it's never reassigned because the finalizer for a particular instance will only ever run once. This is also in the documentation, right after the previous quote:
The finalize method is never invoked more than once by a Java virtual machine for any given object.
It boils down to this: an object can only be finalized once in its lifetime. The finalize method can cause the object to escape deletion the first time, but next time the object is detected as unreachable, it will simply be deleted.
During its lifecycle, an object goes through the phases reachable -> finalizable -> finalized -> reclaimed. After the finalize() method has executed, the object is in the finalized phase and waits for the next GC run, at which point it will be reclaimed. So over the whole lifecycle, the finalize() method is executed only once. Your SAVE_HOOK is a static variable, and it was assigned a value again in the finalize() method. That's why it is not null in the first if block, but the second time, the object is dead.
I have the following code:
public class MyOjbect {
public Integer z = 111;
@Override
protected void finalize() throws Throwable {
System.out.println("invoking GC in MyOjbect");
super.finalize();
}
}
public class GC {
private MyOjbect o;
private void doSomethingElse(MyOjbect obj) {
o = obj;
}
@SuppressWarnings("unused")
public void doSomething() throws InterruptedException {
System.out.println("Start");
MyOjbect o = new MyOjbect();
doSomethingElse(o);
o = new MyOjbect();
doSomethingElse(null);
System.gc();
// System.out.println("checking "+o.z);
}
public static void main(String[] args) throws InterruptedException {
GC gc = new GC();
gc.doSomething();
}
}
I wonder why the GC collects the o variable after the doSomethingElse method has executed, even though the o variable is not yet null. In fact, when I debug the code, o is not null after doSomethingElse, but the GC collects it anyway. In addition, if I uncomment the last line, the program prints the o.z value and only after that invokes the GC.
Update: For people who ask why the local variable has the same name as the field: I have just copied a question from the SCJP test exam as it is.
Lots of subjects to discuss here !
First, as Gyro said, the GC does not collect variables. It collects dead object instances. A dead object is an object that has no chain of strong references (variables) leading to it. Note that there are more subtle cases (weak references, soft references, phantom references, ...), but let's focus on the most common case :-) You can find more information about this here: https://weblogs.java.net/blog/2006/05/04/understanding-weak-references
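The distinction between variables and instances can be sketched as follows, assuming Java 9+ for Reference.reachabilityFence. The class and method names are made up for illustration. Two variables point to one instance, and clearing one of them does not make the instance dead, because the other variable is still used afterwards:

```java
import java.lang.ref.Reference;
import java.lang.ref.WeakReference;

class VariablesVsObjects {
    /** Returns true if the instance is still there after one of its two variables is cleared. */
    static boolean survivesWhileSecondVariableIsUsed() {
        Object first = new Object();
        Object second = first;                 // two variables, one instance
        WeakReference<Object> probe = new WeakReference<>(first);

        first = null;                          // clears a variable, not the object
        System.gc();                           // only a hint
        boolean stillThere = probe.get() != null;

        // 'second' is used here, so the instance was strongly reachable
        // at the time of the check above and the weak reference could not be cleared.
        Reference.reachabilityFence(second);
        return stillThere;
    }

    public static void main(String[] args) {
        System.out.println(survivesWhileSecondVariableIsUsed()); // prints "true"
    }
}
```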
If you uncomment the last line, "111" gets printed, since o is the local variable that references the instance of MyOjbect you created with o = new MyOjbect();.
Now, the trickiest thing: you have two different instances of MyOjbect. However, your program prints "invoking GC in MyOjbect" only once. It becomes evident if you change your MyOjbect class like this:
public class MyOjbect {
public Integer z = 111;
public MyOjbect() {
System.out.println("Creating MyObject " + hashCode());
}
@Override
protected void finalize() throws Throwable {
System.out.println("invoking GC in MyOjbect " + hashCode());
super.finalize();
}
}
Your program now prints two MyOjbect creations but only one that is reclaimed by the GC. This is because there is absolutely no guarantee that the finalize() method will be called. According to the JLS and the javadoc of finalize():
The Java programming language does not guarantee which thread will
invoke the finalize method for any given object
In your case, the end of the application makes every object dead. No need to run a GC since the heap will be completely recovered once the JVM exits.
Consider an object declared in a method:
public void foo() {
final Object obj = new Object();
// A long run job that consumes tons of memory and
// triggers garbage collection
}
Will obj be subject to garbage collection before foo() returns?
UPDATE:
Previously I thought obj is not subject to garbage collection until foo() returns.
However, today I find myself wrong.
I have spent several hours fixing a bug and finally found that the problem is caused by obj being garbage collected!
Can anyone explain why this happens? And if I want obj to be pinned, how can I achieve that?
Here is the code that has the problem.
public class Program
{
public static void main(String[] args) throws Exception {
String connectionString = "jdbc:mysql://<whatever>";
// I find wrap is gc-ed somewhere
SqlConnection wrap = new SqlConnection(connectionString);
Connection con = wrap.currentConnection();
Statement stmt = con.createStatement(ResultSet.TYPE_FORWARD_ONLY,
ResultSet.CONCUR_READ_ONLY);
stmt.setFetchSize(Integer.MIN_VALUE);
ResultSet rs = stmt.executeQuery("select instance_id, doc_id from crawler_archive.documents");
while (rs.next()) {
int instanceID = rs.getInt(1);
int docID = rs.getInt(2);
if (docID % 1000 == 0) {
System.out.println(docID);
}
}
rs.close();
//wrap.close();
}
}
After running the Java program, it will print the following message before it crashes:
161000
161000
********************************
Finalizer CALLED!!
********************************
********************************
Close CALLED!!
********************************
162000
Exception in thread "main" com.mysql.jdbc.exceptions.jdbc4.CommunicationsException:
And here is the code of class SqlConnection:
class SqlConnection
{
private final String connectionString;
private Connection connection;
public SqlConnection(String connectionString) {
this.connectionString = connectionString;
}
public synchronized Connection currentConnection() throws SQLException {
if (this.connection == null || this.connection.isClosed()) {
this.closeConnection();
this.connection = DriverManager.getConnection(connectionString);
}
return this.connection;
}
protected void finalize() throws Throwable {
try {
System.out.println("********************************");
System.out.println("Finalizer CALLED!!");
System.out.println("********************************");
this.close();
} finally {
super.finalize();
}
}
public void close() {
System.out.println("********************************");
System.out.println("Close CALLED!!");
System.out.println("********************************");
this.closeConnection();
}
protected void closeConnection() {
if (this.connection != null) {
try {
connection.close();
} catch (Throwable e) {
} finally {
this.connection = null;
}
}
}
}
I'm genuinely astonished by this, but you're right. It's easily reproducible, you don't need to muck about with database connections and the like:
public class GcTest {
public static void main(String[] args) {
System.out.println("Starting");
Object dummy = new GcTest(); // gets GC'd before method exits
// gets bigger and bigger until heap explodes
Collection<String> collection = new ArrayList<String>();
// method never exits normally because of while loop
while (true) {
collection.add(new String("test"));
}
}
@Override
protected void finalize() throws Throwable {
System.out.println("Finalizing instance of GcTest");
}
}
Runs with:
Starting
Finalizing instance of GcTest
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:2760)
at java.util.Arrays.copyOf(Arrays.java:2734)
at java.util.ArrayList.ensureCapacity(ArrayList.java:167)
at java.util.ArrayList.add(ArrayList.java:351)
at test.GcTest.main(GcTest.java:22)
Like I said, I can hardly believe it, but there's no denying the evidence.
It does make a perverse kind of sense, though: the VM will have figured out that the object is never used, and so gets rid of it. This must be permitted by the spec.
Going back to the question's code, you should never rely on finalize() to clean up your connections; you should always do it explicitly.
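A minimal sketch of explicit cleanup, using a made-up TrackedResource class as a stand-in for the question's SqlConnection wrapper (it is not the asker's class and does not touch JDBC). Implementing AutoCloseable lets try-with-resources close the resource at a known point, regardless of when or whether the GC runs:

```java
class ExplicitClose {
    /** Hypothetical stand-in for a wrapper that owns something closeable. */
    static final class TrackedResource implements AutoCloseable {
        private boolean closed;

        boolean isClosed() {
            return closed;
        }

        @Override
        public void close() {
            closed = true;
            System.out.println("closed explicitly");
        }
    }

    /** Uses the resource and closes it deterministically when the try block ends. */
    static TrackedResource useAndClose() {
        TrackedResource resource = new TrackedResource();
        try (TrackedResource scoped = resource) {
            // ... long-running work with 'scoped' goes here ...
        }
        return resource; // already closed at this point
    }

    public static void main(String[] args) {
        System.out.println(useAndClose().isClosed()); // prints "closed explicitly" then "true"
    }
}
```

As a side effect, the implicit close() at the end of the try block is a use of the resource, so the wrapper stays reachable for the whole block.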
As your code is written the object pointed to by "wrap" shouldn't be eligible for garbage collection until "wrap" pops off the stack at the end of the method.
The fact that it is being collected suggests to me your code as compiled doesn't match the original source and that the compiler has done some optimisation such as changing this:
SqlConnection wrap = new SqlConnection(connectionString);
Connection con = wrap.currentConnection();
to this:
Connection con = new SqlConnection(connectionString).currentConnection();
(Or even inlining the whole thing) because "wrap" isn't used beyond this point. The anonymous object created would be eligible for GC immediately.
The only way to be sure is to decompile the code and see what's been done to it.
There are really two different things happening here. obj is a stack variable being set to a reference to the Object, and the Object is allocated on the heap. The stack will just be cleared (by stack pointer arithmetic).
But yes, the Object itself will be cleared by garbage collection. All heap-allocated objects are subject to GC.
EDIT: To answer your more specific question, the Java spec does not guarantee collection by any particular time (see the JVM spec) (of course it will be collected after its last use). So it's only possible to answer for specific implementations.
EDIT: Clarification, per comments
As I'm sure you're aware, garbage collection and finalization in Java are non-deterministic. All you can determine in this case is when wrap is eligible for garbage collection. I'm assuming you are asking whether wrap only becomes eligible for GC when the method returns (and wrap goes out of scope). I think that some JVMs (e.g. HotSpot with -server) won't wait for the object reference to be popped from the stack; they will make it eligible for GC as soon as nothing else references it. It looks like this is what you are seeing.
To summarise, you are relying on finalization being slow enough not to finalize the instance of SqlConnection before the method exits. Your finalizer is closing a resource that the SqlConnection is no longer responsible for. Instead, you should let the Connection object be responsible for its own finalization.
Will obj be subject to garbage collection before foo() returns?
You cannot be sure obj will be collected before foo returns but it is certainly eligible for collection before foo returns.
Can anyone explain why this happens?
GCs collect unreachable objects. Your object is likely to become unreachable before foo returns.
Scope is irrelevant. The idea that obj stays on the stack until foo returns is an overly-simplistic mental model. Real systems don't work like that.
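If an object really has to be pinned for the duration of some work, one option is an explicit fence after the work. This is a minimal sketch, assuming Java 9+ for Reference.reachabilityFence; the Pinning class and the runPinned helper are made up for illustration:

```java
import java.lang.ref.Reference;
import java.lang.ref.WeakReference;

class Pinning {
    /** Runs the job and keeps 'pinned' strongly reachable until the job has finished. */
    static void runPinned(Object pinned, Runnable job) {
        try {
            job.run();
        } finally {
            Reference.reachabilityFence(pinned); // counts as a use of 'pinned' after the job
        }
    }

    /** Returns true if the pinned object was still there while the job ran. */
    static boolean demo() {
        Object owner = new Object();
        WeakReference<Object> probe = new WeakReference<>(owner);
        boolean[] aliveDuringJob = new boolean[1];
        runPinned(owner, () -> {
            System.gc(); // only a hint; the pinned object cannot be cleared either way
            aliveDuringJob[0] = probe.get() != null;
        });
        return aliveDuringJob[0];
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "true"
    }
}
```

The fence sits in a finally block so that the object stays reachable even if the job throws.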
Here, obj is a local variable in the method, and it is popped off the stack as soon as the method returns or exits. This leaves no way to reach the Object on the heap, and hence it will be garbage collected. The Object on the heap will be GC'd only after its reference obj is popped off the stack, i.e., only after the method finishes or returns.
EDIT:
To answer your update,
UPDATE: Let me make the question more clear.
Will obj be subject to garbage collection before foo() returns?
obj is just a reference to the actual object on the heap. Here obj is declared inside the method foo(). So your question, "Will obj be subject to garbage collection before foo() returns?", does not apply, as obj lives in the stack frame while the method foo() is running and is gone when the method finishes.
According to the current spec, there isn't even a happens-before ordering from finalisation to normal use. So, to impose order, you actually need to use a lock, a volatile or, if you are desperate, stashing a reference reachable from a static. There is certainly nothing special about scope.
It should be rare that you actually need to write a finaliser.
I was reading Java Platform Performance (sadly the link seems to have disappeared from the internet since I originally posed this question) and section A.3.3 worried me.
I had been working on the assumption that a variable that dropped out of scope would no longer be considered a GC root, but this paper appears to contradict that.
Do recent JVMs, in particular Sun's 1.6.0_07 version, still have this limitation? If so, then I have a lot of code to analyse...
I ask the question because the paper is from 1999 - sometimes things change, particularly in the world of GC.
As the paper is no longer available, I'd like to paraphrase the concern. The paper implied that variables that were defined inside a method would be considered a GC root until the method exited, and not until the code block ended. Therefore setting the variable to null was necessary to permit the Object referenced to be garbage collected.
This meant that a local variable defined in a conditional block in the main() method (or similar method that contained an infinite loop) would cause a one-off memory leak unless you nulled a variable just before it dropped out of scope.
The code from the chosen answer illustrates the issue well. On the version of the JVM referenced in the document, the foo object can not be garbage collected when it drops out of scope at the end of the try block. Instead, the JVM will hold open the reference until the end of the main() method, even though it is impossible for anything to use that reference.
This appears to be the origin of the idea that nulling a variable reference would help the garbage collector out, even if the variable was just about to drop out of scope.
This code should clear it up:
public class TestInvisibleObject{
public static class PrintWhenFinalized{
private String s;
public PrintWhenFinalized(String s){
System.out.println("Constructing from "+s);
this.s = s;
}
protected void finalize() throws Throwable {
System.out.println("Finalizing from "+s);
}
}
public static void main(String[] args) {
try {
PrintWhenFinalized foo = new PrintWhenFinalized("main");
} catch (Exception e) {
// whatever
}
while (true) {
// Provoke garbage-collection by allocating lots of memory
byte[] o = new byte[1024];
}
}
}
On my machine (jdk1.6.0_05) it prints:
Constructing from main
Finalizing from main
So it looks like the problem has been fixed.
Note that using System.gc() instead of the loop does not cause the object to be collected for some reason.
The problem is still there. I tested it with Java 8 and could prove it.
You should note the following things:
1. The only way to force a guaranteed garbage collection is to attempt an allocation that ends in an OutOfMemoryError, as the JVM is required to try freeing unused objects before throwing. This, however, does not hold if the requested amount is too large to ever succeed, i.e. exceeds the address space. Raising the allocation size until you get an OOME is a good strategy.
2. The guaranteed GC described in point 1 does not guarantee finalization. The time when finalize() methods are invoked is not specified; they might never be called at all. So adding a finalize() method to a class might prevent its instances from being collected, which makes finalize a poor choice for analysing GC behavior.
3. Creating another new local variable after a local variable went out of scope will reuse its place in the stack frame. In the following example, object a will be collected, as its place in the stack frame is occupied by the local variable b. But b lasts until the end of the main method, as there is no other local variable to occupy its place.
import java.lang.ref.*;
public class Test {
static final ReferenceQueue<Object> RQ=new ReferenceQueue<>();
static Reference<Object> A, B;
public static void main(String[] s) {
{
Object a=new Object();
A=new PhantomReference<>(a, RQ);
}
{
Object b=new Object();
B=new PhantomReference<>(b, RQ);
}
forceGC();
checkGC();
}
private static void forceGC() {
try {
for(int i=100000;;i+=i) {
byte[] b=new byte[i];
}
} catch(OutOfMemoryError err){ err.printStackTrace();}
}
private static void checkGC() {
for(;;) {
Reference<?> r=RQ.poll();
if(r==null) break;
if(r==A) System.out.println("Object a collected");
if(r==B) System.out.println("Object b collected");
}
}
}
The article states that:
... an efficient implementation of the
JVM is unlikely to zero the reference
when it goes out of scope
I think this happens because of situations like this:
public void doSomething() {
for(int i = 0; i < 10 ; i++) {
String s = new String("boo");
System.out.println(s);
}
}
Here, the same reference is used by the "efficient JVM" in each declaration of String s, but there will be 10 new Strings in the heap if the GC doesn't kick in.
In the article's example, I think that the reference to foo stays on the stack because the "efficient JVM" thinks it very likely that another foo object will be created and, if so, it will use the same reference. Thoughts?
public void run() {
try {
Object foo = new Object();
foo.doSomething();
} catch (Exception e) {
// whatever
}
while (true) {
// do stuff
} // loop forever
}
I've also performed the next test with profiling:
public class A {
public static void main(String[] args) {
A a = new A();
a.test4();
}
public void test1() {
for(int i = 0; i < 10 ; i++) {
B b = new B();
System.out.println(b.toString());
}
System.out.println("b is collected");
}
public void test2() {
try {
B b = new B();
System.out.println(b.toString());
} catch (Exception e) {
}
System.out.println("b is invisible");
}
public void test3() {
if (true) {
B b = new B();
System.out.println(b.toString());
}
System.out.println("b is invisible");
}
public void test4() {
int i = 0;
while (i < 10) {
B b = new B();
System.out.println(b.toString());
i++;
}
System.out.println("b is collected");
}
public A() {
}
class B {
public B() {
}
@Override
public String toString() {
return "I'm B.";
}
}
}
and came to these conclusions:
test1 -> b is collected
test2 -> b is invisible
test3 -> b is invisible
test4 -> b is collected
... so I think that, in loops, the JVM doesn't create invisible variables when the loop ends because it's unlikely they will be declared again outside the loop.
Any thoughts?
Would you really have that much code to analyse? Basically I can only see this being a significant problem for very long-running methods - which are typically just the ones at the top of each thread's stack.
I wouldn't be at all surprised if it's unfixed at the moment, but I don't think it's likely to be as significant as you seem to fear.