Are invisible references still an issue in recent JVMs? - java

I was reading Java Platform Performance (sadly the link seems to have disappeared from the internet since I originally posed this question) and section A.3.3 worried me.
I had been working on the assumption that a variable that dropped out of scope would no longer be considered a GC root, but this paper appears to contradict that.
Do recent JVMs, in particular Sun's 1.6.0_07 version, still have this limitation? If so, then I have a lot of code to analyse...
I ask the question because the paper is from 1999 - sometimes things change, particularly in the world of GC.
As the paper is no longer available, I'd like to paraphrase the concern. The paper implied that a variable defined inside a method would be treated as a GC root until the method exited, not merely until its enclosing code block ended. Setting the variable to null was therefore necessary to allow the referenced object to be garbage collected.
This meant that a local variable defined in a conditional block in the main() method (or a similar method containing an infinite loop) would cause a one-off memory leak unless you nulled the variable just before it dropped out of scope.
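To illustrate, the workaround the paper recommended would look roughly like this (my own sketch of the described pattern, not code taken from the paper):
public class LeakSketch {
    public static void main(String[] args) {
        if (args.length > 0) {
            byte[] buffer = new byte[10 * 1024 * 1024]; // only needed briefly
            process(buffer);
            buffer = null; // the recommended workaround: null it before it drops out of scope
        }
        // On the JVMs the paper describes, without the nulling above the buffer would stay
        // reachable from main()'s stack frame for as long as this loop runs.
        while (true) {
            // long-running work that never touches buffer again
        }
    }

    private static void process(byte[] data) {
        // placeholder for real work
    }
}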
The code from the chosen answer illustrates the issue well. On the JVM version referenced in the document, the foo object cannot be garbage collected when it drops out of scope at the end of the try block. Instead, the JVM holds on to the reference until the end of the main() method, even though nothing can possibly use that reference.
This appears to be the origin of the idea that nulling a variable reference would help the garbage collector out, even if the variable was just about to drop out of scope.

This code should clear it up:
public class TestInvisibleObject {
    public static class PrintWhenFinalized {
        private String s;

        public PrintWhenFinalized(String s) {
            System.out.println("Constructing from " + s);
            this.s = s;
        }

        protected void finalize() throws Throwable {
            System.out.println("Finalizing from " + s);
        }
    }

    public static void main(String[] args) {
        try {
            PrintWhenFinalized foo = new PrintWhenFinalized("main");
        } catch (Exception e) {
            // whatever
        }
        while (true) {
            // Provoke garbage collection by allocating lots of memory
            byte[] o = new byte[1024];
        }
    }
}
On my machine (jdk1.6.0_05) it prints:
Constructing from main
Finalizing from main
So it looks like the problem has been fixed.
Note that using System.gc() instead of the loop does not cause the object to be collected (or at least its finalizer never runs), most likely because System.gc() is only a hint and, even when a collection does run, finalize() is invoked later on a separate finalizer thread, which may not get a chance to run before the JVM exits.

The problem is still there. I tested it with Java 8 and can demonstrate it.
You should note the following things:
The only way to force a guaranteed garbage collection is to attempt an allocation that ends in an OutOfMemoryError, because the JVM is required to try freeing unused objects before throwing it. This does not hold, however, if the requested amount is too large to ever succeed, i.e. it exceeds the address space. Raising the allocation size until an OOME occurs is a good strategy.
The guaranteed GC described in point 1 does not guarantee finalization. The time at which finalize() methods are invoked is not specified; they might never be called at all. Adding a finalize() method to a class might even prevent its instances from being collected promptly, so finalize() is not a good choice for analysing GC behaviour.
Creating a new local variable after another local variable has gone out of scope will reuse its slot in the stack frame. In the following example, object a will be collected because its slot in the stack frame is taken over by the local variable b, but b lasts until the end of the main method, as there is no other local variable to occupy its slot.
import java.lang.ref.*;

public class Test {
    static final ReferenceQueue<Object> RQ = new ReferenceQueue<>();
    static Reference<Object> A, B;

    public static void main(String[] s) {
        {
            Object a = new Object();
            A = new PhantomReference<>(a, RQ);
        }
        {
            Object b = new Object();
            B = new PhantomReference<>(b, RQ);
        }
        forceGC();
        checkGC();
    }

    private static void forceGC() {
        try {
            for (int i = 100000; ; i += i) {
                byte[] b = new byte[i];
            }
        } catch (OutOfMemoryError err) {
            err.printStackTrace();
        }
    }

    private static void checkGC() {
        for (;;) {
            Reference<?> r = RQ.poll();
            if (r == null) break;
            if (r == A) System.out.println("Object a collected");
            if (r == B) System.out.println("Object b collected");
        }
    }
}

The article states that:
... an efficient implementation of the JVM is unlikely to zero the reference when it goes out of scope
I think this happens because of situations like this:
public void doSomething() {
    for (int i = 0; i < 10; i++) {
        String s = new String("boo");
        System.out.println(s);
    }
}
Here, the same stack slot is reused by the "efficient JVM" for each declaration of String s, but there will be 10 new Strings in the heap if the GC doesn't kick in.
In the article's example I think the reference to foo stays on the stack because the "efficient JVM" considers it very likely that another foo object will be created and, if so, it will reuse the same slot. Thoughts?
public void run() {
    try {
        Object foo = new Object();
        foo.doSomething();
    } catch (Exception e) {
        // whatever
    }
    while (true) {
        // do stuff; loop forever
    }
}
I've also performed the following test with profiling:
public class A {
    public static void main(String[] args) {
        A a = new A();
        a.test4();
    }

    public void test1() {
        for (int i = 0; i < 10; i++) {
            B b = new B();
            System.out.println(b.toString());
        }
        System.out.println("b is collected");
    }

    public void test2() {
        try {
            B b = new B();
            System.out.println(b.toString());
        } catch (Exception e) {
        }
        System.out.println("b is invisible");
    }

    public void test3() {
        if (true) {
            B b = new B();
            System.out.println(b.toString());
        }
        System.out.println("b is invisible");
    }

    public void test4() {
        int i = 0;
        while (i < 10) {
            B b = new B();
            System.out.println(b.toString());
            i++;
        }
        System.out.println("b is collected");
    }

    public A() {
    }

    class B {
        public B() {
        }

        @Override
        public String toString() {
            return "I'm B.";
        }
    }
}
and came to these conclusions:
test1 -> b is collected
test2 -> b is invisible
test3 -> b is invisible
test4 -> b is collected
... so I think that, for loops, the JVM doesn't keep an invisible reference alive once the loop ends, because it's unlikely the variable will be declared again outside the loop.
Any thoughts?

Would you really have that much code to analyse? Basically I can only see this being a significant problem for very long-running methods - which are typically just the ones at the top of each thread's stack.
I wouldn't be at all surprised if it's unfixed at the moment, but I don't think it's likely to be as significant as you seem to fear.
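If you do turn up such a long-running method, one low-effort mitigation (a sketch using my own names, not something taken from the paper) is to move the short-lived work into a helper method; the local then genuinely dies when the helper returns, and the question of invisible references never arises:
public class Worker {
    public void run() {
        setUp(); // any short-lived locals are gone once this returns
        while (true) {
            handleNextRequest(); // the long-running loop holds no stale locals
        }
    }

    private void setUp() {
        byte[] scratch = new byte[1024 * 1024];
        // ... use scratch ...
        // scratch is unreachable after setUp() returns, invisible references or not
    }

    private void handleNextRequest() {
        // placeholder for the real per-iteration work
    }
}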

Related

Multiple Threads accessing instance method from different Instances should cause a race condition?

I am trying to understand synchronized in Java.
I understand that if I access a synchronized method on the same object from 2 different threads, only one will be able to access it at a time.
But I think that if the same method is called on 2 different instances, both objects should be able to execute the method in parallel, which would cause a race condition if the method accesses or modifies a static member variable. However, I am not able to see the race condition happening in the code below.
Could someone please explain what's wrong with the code or with my understanding?
For reference code is accessible at : http://ideone.com/wo6h4R
class MyClass
{
    public static int count = 0;

    public int getCount()
    {
        System.out.println("Inside getcount()");
        return count;
    }

    public synchronized void incrementCount()
    {
        count = count + 1;
    }
}

class Ideone
{
    public static void main(String[] args) throws InterruptedException {
        final MyClass test1 = new MyClass();
        final MyClass test2 = new MyClass();

        Thread t1 = new Thread() {
            public void run()
            {
                int k = 0;
                while (k++ < 50000000)
                {
                    test1.incrementCount();
                }
            }
        };

        Thread t2 = new Thread() {
            public void run()
            {
                int l = 0;
                while (l++ < 50000000)
                {
                    test2.incrementCount();
                }
            }
        };

        t1.start();
        t2.start();
        t1.join();
        t2.join();

        //System.out.println(t2.getState());
        int x = 500000000 + 500000000;
        System.out.println(x);
        System.out.println("count = " + MyClass.count);
    }
}
You're right that the race condition exists. But the racy operations are so quick that a collision is unlikely to be observed, and the synchronized keywords are likely providing synchronization "help" that, while not required by the JLS, hides the races.
If you want to make it a bit more obvious, you can "spell out" the count = count + 1 code and put in a sleep:
public synchronized void incrementCount()
{
    int tmp = count + 1;
    try {
        Thread.sleep(500);
    } catch (InterruptedException e) {
        e.printStackTrace();
    }
    count = tmp;
}
That should show the races more easily. (My handling of the interrupted exception is not good for production code, btw; but it's good enough for small test apps like this.)
The lesson learned here is: race conditions can be really hard to catch through testing, so it's best to really understand the code and prove to yourself that it's right.
Since synchronized instance methods actually synchronize on this, methods called on different instances lock different objects and therefore don't block each other, so you will get race conditions.
You probably have to make your own lock object and lock on that.
class MyClass
{
    public static int count = 0;

    // this is what you lock on
    private static Object lock = new Object();

    public int getCount()
    {
        synchronized (lock) {
            System.out.println("Inside getcount()");
            return count;
        }
    }

    public void incrementCount()
    {
        synchronized (lock) {
            count = count + 1;
        }
    }
    // etc.
Now when you run your main, this gets printed out:
1000000000
count = 100000000
Here's the relevant section of the Java specification:
"A synchronized method acquires a monitor (§17.1) before it executes. For a class (static) method, the monitor associated with the Class object for the method's class is used. For an instance method, the monitor associated with this (the object for which the method was invoked) is used."
However, I fail to see where the MyClass instances are actually incrementing "count", so what exactly are you expecting to show as a race condition?
(Taken originally from this answer)
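To make the quoted rule concrete, a synchronized instance method is roughly equivalent to wrapping its body in synchronized (this), and a synchronized static method to locking the Class object. The sketch below (my own illustration, not code from the original answers) shows why calls on test1 and test2 never block each other even though both touch the same static count:
class MonitorEquivalence {
    static int count = 0;

    // These two acquire the same monitor: the instance the method is called on.
    public synchronized void incrementA() {
        count = count + 1;
    }

    public void incrementB() {
        synchronized (this) {
            count = count + 1;
        }
    }

    // These two acquire the monitor of MonitorEquivalence.class instead,
    // which is shared by every caller regardless of instance.
    public static synchronized void staticIncrementA() {
        count = count + 1;
    }

    public static void staticIncrementB() {
        synchronized (MonitorEquivalence.class) {
            count = count + 1;
        }
    }
}
So with two instances, incrementCount() locks test1 for one thread and test2 for the other, and the unsynchronized read-modify-write of the static count can still interleave; making the method static synchronized (or locking a shared object, as in the answer above) removes the race.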

getting wrong value when using volatile keyword in java

I have a doubt about the behaviour of the volatile keyword.
public class TestClassI extends Thread {
    private volatile int i = 5;
    boolean flag;

    public TestClassI(boolean flag) {
        this.i = i;
        this.flag = flag;
    }

    public void run()
    {
        if (flag)
        {
            while (true)
            {
                System.out.println(i);
                try {
                    Thread.sleep(1000);
                } catch (InterruptedException e) {
                    // TODO Auto-generated catch block
                    e.printStackTrace();
                }
            }
        }
        else
        {
            i = 10;
        }
    }
}
and it is used in the main class as follows:
public class TestMain {
    public static volatile int i = 5;

    public static void main(String args[]) throws InterruptedException
    {
        TestClassI test = new TestClassI(true);
        test.start();
        Thread.sleep(1000);
        TestClassI test2 = new TestClassI(false);
        test2.start();
    }
}
I expected the output to be like
5
5
5
5
10
10
10
10.
But it keeps on printing 5. As per the nature of volatile, the value of i should be stored to and retrieved from main memory each time. Please explain: is there anything wrong with this code?
You have two instances of TestClassI. They each have their own copy of i (because it is an instance field). They don't interfere with each other at all.
The static TestMain.i is not used in the program at all. There is no shared state.
When the object is instantiated with a false flag, the run method sets its own i to 10 and returns; that i is never printed:
if (flag) {
    // irrelevant
} else {
    i = 10;
}
You don't set i = 10 in your loop; it is set as the only action of the second thread, and then never printed...
The atomicity side of volatile has no effect in this case because the variable is an int: for long and double variables, volatile additionally guarantees that loads and stores happen as one indivisible operation rather than two word-sized ones, but int accesses are already atomic. So, in that respect, volatile changes nothing here.
What actually produces the unexpected output is that i is not shared between your threads. There is a separate i for each thread, and the static i in the main class is never touched by either of them. If you'd like it to work, one simple modification will suffice: make the i variable in the thread class static. Not in the main class, in the thread class.
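For illustration, a minimal sketch of that fix (my own cleaned-up version of the question's class): making i a static volatile field of TestClassI means both threads read and write the same location, so the reader should eventually print 10.
public class TestClassI extends Thread {
    // shared by all instances, and volatile so the writer's update becomes visible to the reader
    private static volatile int i = 5;
    private final boolean flag;

    public TestClassI(boolean flag) {
        this.flag = flag;
    }

    @Override
    public void run() {
        if (flag) {
            while (true) {
                System.out.println(i); // prints 5 until the other thread writes 10
                try {
                    Thread.sleep(1000);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            }
        } else {
            i = 10;
        }
    }
}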

Making this visibility example fail

In Java Concurrency in Practice one of the examples that I think surprises people (at least me) is something like this:
public class Foo {
    private int n;

    public Foo(int n) {
        this.n = n;
    }

    public void check() {
        if (n != n) throw new AssertionError("huh?");
    }
}
The surprise (to me at least) was the claim that this is not thread-safe, and not only is it not safe, but there is also a chance that the check method will throw the assertion error.
The explanation is that without synchronization / marking n as volatile, there is no visibility guarantee between different threads, and that the value of n can change while a thread is reading it.
But I wonder how likely it is to happen in practice. Or better, if I could replicate it somehow.
So I was trying to write code that will trigger that assertion error, without luck.
Is there a straight forward way to write a test that will prove that this visibility issue is not just theoretical?
Or is it something that changed in the more recent JVMs?
EDIT: related question: Not thread safe Object publishing
But I wonder how likely it is to happen in practice.
Highly unlikely, especially as the JIT can turn n into a local variable and only read it once.
The problem with highly unlikely thread-safety bugs is that one day you might change something which shouldn't matter, like your choice of processor or JVM, and suddenly your code breaks randomly.
Or better, if I could replicate it somehow.
There is no guarantee you can reproduce it either.
Is there a straight forward way to write a test that will prove that this visibility issue is not just theoretical?
In some cases, yes, but this one is hard to prove, partly because the JVM is not prevented from being more thread-safe than the minimum the JLS requires.
For example, the HotSpot JVM often does what you might expect, not just the minimum in the documentation: System.gc() is only a hint according to the javadoc, but by default the HotSpot JVM will run a collection every time.
This scenario is VERY unlikely, but still possible. The thread would have to be paused at the very moment the first n in the comparison is loaded, before the second one is loaded and they are compared, and that window is such a tiny fraction of a second that you would have to be very lucky to hit it. But if you write this code in a critical application that millions of users run every day, worldwide, it will happen sooner or later; it is only a matter of time.
There is no guarantee you can reproduce it; maybe it is not even possible on your machine. It depends on your platform, the VM, the Java compiler, etc.
You could copy the first n into a local variable, then pause the thread (sleep) and have a second thread change n before you do the comparison, but I think that would defeat the purpose of demonstrating your case.
If a Foo is published unsafely, theoretically another thread could observe two different values when it reads n twice.
The following program could fail for that reason.
public static Foo shared;

public static void main(String[] args)
{
    new Thread() {
        @Override
        public void run()
        {
            while (true)
            {
                Foo foo = shared;
                if (foo != null)
                    foo.check();
            }
        }
    }.start();

    new Thread() {
        @Override
        public void run()
        {
            while (true)
            {
                shared = new Foo(1); // unsafe publication
            }
        }
    }.start();
}
However, it's almost impossible to observe it failing; the VM likely optimizes n != n to false without actually reading n twice.
But we can write an equivalent program, i.e. a valid transformation of the previous one as far as the Java Memory Model is concerned, and observe that it fails immediately:
static public class Foo
{
    int n;

    public Foo()
    {
    }

    public void check()
    {
        int n1 = n;
        no_op();
        int n2 = n;
        if (n1 != n2)
            throw new AssertionError("huh?");
    }
}

// calling this method has no effect on memory semantics
static void no_op()
{
    if (Math.sin(1) > 1) System.out.println("never");
}

public static Foo shared;

public static void main(String[] args)
{
    new Thread() {
        @Override
        public void run()
        {
            while (true)
            {
                Foo foo = shared;
                if (foo != null)
                    foo.check();
            }
        }
    }.start();

    new Thread() {
        @Override
        public void run()
        {
            while (true)
            {
                // a valid transformation of `shared = new Foo(1)`
                Foo foo = new Foo();
                shared = foo;
                no_op();
                foo.n = 1;
            }
        }
    }.start();
}

Strange behavior of JAVAGC

I have the following code:
public class MyOjbect {
    public Integer z = 111;

    @Override
    protected void finalize() throws Throwable {
        System.out.println("invoking GC in MyOjbect");
        super.finalize();
    }
}

public class GC {
    private MyOjbect o;

    private void doSomethingElse(MyOjbect obj) {
        o = obj;
    }

    @SuppressWarnings("unused")
    public void doSomething() throws InterruptedException {
        System.out.println("Start");
        MyOjbect o = new MyOjbect();
        doSomethingElse(o);
        o = new MyOjbect();
        doSomethingElse(null);
        System.gc();
        // System.out.println("checking "+o.z);
    }

    public static void main(String[] args) throws InterruptedException {
        GC gc = new GC();
        gc.doSomething();
    }
}
I wonder why the GC collects the object referenced by o after executing the doSomethingElse method, even though the o variable is not yet null. In fact, when I debug the code, o is not null after doSomethingElse, yet the GC still collects it. In addition, if I uncomment the last line, the program prints o.z and only after that invokes the GC.
Update: for people asking why the local variable has the same name as the field: I have just copied a question from the SCJP test exam as it is.
Lots of subjects to discuss here!
First, as Gyro said, The GC does not collect variables. It collects instances of dead objects. A dead object is an object that has no strong reference (variable) that leads to it. Note that there are more subtle cases (Weak references, Soft references, Phantom references, ...), but let's focus on the most common case :-) You can find more information about this here : https://weblogs.java.net/blog/2006/05/04/understanding-weak-references
If you uncomment the last line, "111" gets printed, since o is the local variable that references the instance of MyOjbect you created with o = new MyOjbect();.
Now, the trickiest thing: you have two different instances of MyOjbect. However, your program only prints "invoking GC in MyOjbect" once. This becomes evident if you transform your MyOjbect class like this:
public class MyOjbect {
    public Integer z = 111;

    public MyOjbect() {
        System.out.println("Creating MyObject " + hashCode());
    }

    @Override
    protected void finalize() throws Throwable {
        System.out.println("invoking GC in MyOjbect " + hashCode());
        super.finalize();
    }
}
Your program now prints two MyOjbect creations but only one finalization. This is because there is absolutely no guarantee that the finalize() method will be called. According to the JLS and the javadoc of finalize():
The Java programming language does not guarantee which thread will invoke the finalize method for any given object
In your case, the end of the application makes every object dead. There is no need to run a GC since the whole heap is reclaimed once the JVM exits.
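If you want to confirm that both instances really become unreachable without relying on finalize(), phantom references plus a reference queue are more dependable. Here is a rough sketch (my own code, reusing the MyOjbect class above; because MyOjbect has a finalizer, it can take more than one collection cycle before the references are enqueued, hence the repeated attempts):
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;

public class GCPhantomCheck {
    static final ReferenceQueue<Object> QUEUE = new ReferenceQueue<>();
    static Reference<Object> first, second;

    public static void main(String[] args) throws InterruptedException {
        allocate(); // both MyOjbect instances are unreachable once this returns
        for (int attempt = 0; attempt < 3; attempt++) {
            forceGc();
            Thread.sleep(100); // give the finalizer thread a chance to run between collections
        }
        Reference<?> r;
        while ((r = QUEUE.poll()) != null) {
            System.out.println((r == first ? "first" : "second") + " instance was collected");
        }
    }

    private static void allocate() {
        Object a = new MyOjbect();
        Object b = new MyOjbect();
        first = new PhantomReference<>(a, QUEUE);
        second = new PhantomReference<>(b, QUEUE);
    }

    private static void forceGc() {
        try {
            for (int size = 1 << 20; ; size += size) {
                byte[] ignored = new byte[size];
            }
        } catch (OutOfMemoryError expected) {
            // an OutOfMemoryError is only thrown after the JVM has tried to free memory
        }
    }
}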

Why do my variables not go out of scope?

Good afternoon all,
I was taught that when a function returns, the variables within the scope of that function automatically go out of scope, so we do not have to set them to null.
However, this doesn't seem to be true.
I have test code that creates a java.lang.ref.PhantomReference pointing to an instance of java.lang.Object. The only strong reference to that object is within the scope of a function F.
In other words, when that function returns, there should no longer be any strong reference to that object, and the object should now be collectible by the GC.
However, no matter how hard I try to starve the JVM of memory, the GC simply refuses to collect the object. What is surprising is that if I set the variable to null (obj = null;), the GC now collects the object.
What is the explanation behind this oddity?
public class Test {
    public static void main(String args[]) {
        // currently testing on a 64-bit HotSpot Server VM, but the other JVMs
        // should probably have the same behavior for this use case
        Test test = new Test();
        test.F(new Object());
    }

    public <T> void F(T obj) {
        java.lang.ref.ReferenceQueue<T> ref_queue = new java.lang.ref.ReferenceQueue<T>();
        java.lang.ref.PhantomReference<T> ref = new java.lang.ref.PhantomReference<T>(obj, ref_queue); // if this line isn't an assignment, the GC wouldn't collect the object no matter how hard I force it to
        obj = null; // if this line is removed, the GC wouldn't collect the object no matter how hard I force it to
        StartPollingRef(ref_queue);
        GoOom();
    }

    private <T> void StartPollingRef(final java.lang.ref.ReferenceQueue<T> ref_queue) {
        new java.lang.Thread(new java.lang.Runnable() {
            @Override
            public void run() {
                System.out.println("Removing..");
                boolean removed = false;
                while (!removed) {
                    try {
                        ref_queue.remove();
                        removed = true;
                        System.out.println("Removed.");
                    } catch (InterruptedException e) { // ignore
                    }
                }
            }
        }).start();
    }

    private void GoOom() {
        try {
            int len = (int) java.lang.Math.min(java.lang.Integer.MAX_VALUE, Runtime.getRuntime().maxMemory());
            Object[] arr = new Object[len];
        } catch (Throwable e) {
            // System.out.println(e);
        }
    }
}
A standards-compliant JVM is never obligated to collect memory. That is to say, you cannot write a program whose correctness depends on a particular bit of memory being collected at a certain time: you can neither force the JVM to collect (even via System.gc()!) nor rely on it doing so.
So, the behavior you're observing cannot, definitionally, be wrong: you're purposefully trying to make the environment do something it is under no onus to do.
That all said, your issue is that your object has not gone out of scope. It is created in main, then passed, in the normal Java referential manner, to F. Until F returns, the obj parameter is still a reference to your object.
Make GoOom static and put a call to it in main, and you should see the object get collected. But, then again, you might still not, and that wouldn't be wrong...
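For example, here is a sketch of that restructuring (my own variation of the question's code): GoOom is made static and called from main after F has returned. I also keep the PhantomReference in a static field so the Reference object itself stays reachable after F returns; otherwise it could be collected before it is ever enqueued. As noted above, even this is not guaranteed to print anything.
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;

public class Test {
    // keep the Reference itself strongly reachable after F returns
    static PhantomReference<?> ref;

    public static void main(String[] args) throws InterruptedException {
        Test test = new Test();
        test.F(new Object());
        GoOom();            // runs after F has returned, so obj is genuinely unreachable
        Thread.sleep(1000); // give the polling thread a moment to report
    }

    public <T> void F(T obj) {
        ReferenceQueue<T> refQueue = new ReferenceQueue<T>();
        ref = new PhantomReference<T>(obj, refQueue);
        StartPollingRef(refQueue);
        // no obj = null and no GoOom() call needed in here any more
    }

    private <T> void StartPollingRef(final ReferenceQueue<T> refQueue) {
        new Thread(new Runnable() {
            @Override
            public void run() {
                System.out.println("Removing..");
                while (true) {
                    try {
                        refQueue.remove();
                        System.out.println("Removed.");
                        return;
                    } catch (InterruptedException e) {
                        // ignore and retry
                    }
                }
            }
        }).start();
    }

    private static void GoOom() {
        try {
            int len = (int) Math.min(Integer.MAX_VALUE, Runtime.getRuntime().maxMemory());
            Object[] arr = new Object[len];
        } catch (Throwable e) {
            // expected: OutOfMemoryError; a collection attempt precedes it
        }
    }
}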
