Weak Reference and strong Reference - java

I have a very basic question regarding weak references and strong references in Java.
In general Java programming we usually do not create a weak reference to an object. We create a normal strong reference, and when we are done with the object we assign null to the variable, on the understanding that the object will be collected by the GC the next time it runs.
Is my understanding wrong?
After reading some articles, it looks like an object is collected by the GC when it is no longer referenced anywhere, or when it has only weak references. I am confused.
In other words, what is the difference between these two code snippets with respect to Java GC?
Snippet 1
Counter counter = new Counter(); // strong reference - line 1
WeakReference<Counter> weakCounter = new WeakReference<Counter> (counter); //weak reference
counter = null;
Snippet 2
Counter counter = new Counter(); // strong reference - line 1
counter = null;

In both cases, counter will be eligible for garbage collection. Even if you use SoftReference, it will be eligible for GC, but it will only be collected reluctantly. (That is, a SoftReference encourages the GC to leave the object in memory, but still allows it to be collected.)
Only hard references force the GC to leave objects alone.
Normally you only need to assign null to a reference if the reference has longer life than you want for the object. Once a hard-reference variable goes out of scope, it is no longer reachable from live code so its hard reference will not prevent the GC from collecting the object.
Note also that there's no guarantee as to when objects eligible for collection will actually be collected by the GC. It may be on the next GC cycle or maybe not. It depends heavily on the implementation of the GC. The only thing you can say for sure is that all eligible objects will be collected before the VM throws an OutOfMemoryError.
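To make the distinction concrete, here is a minimal sketch. The Counter class is a stand-in for the one in the question (assumed to be a plain object), and ReferenceDemo is a made-up name. The only behaviour the sketch relies on is that a weak reference cannot be cleared while a strong reference to the same object is still in use; what happens after the strong reference is nulled depends on the collector.

```java
import java.lang.ref.WeakReference;

class ReferenceDemo {
    // Stand-in for the Counter class from the question (assumption: a plain object).
    static class Counter {}

    // While a strong reference is still in use, the weak reference must return the object.
    static boolean weakSeesObjectWhileStronglyHeld() {
        Counter counter = new Counter();
        WeakReference<Counter> weakCounter = new WeakReference<>(counter);
        System.gc(); // only a hint; it cannot clear weakCounter while counter is used below
        return weakCounter.get() == counter;
    }

    public static void main(String[] args) {
        System.out.println("held strongly: " + weakSeesObjectWhileStronglyHeld());

        Counter counter = new Counter();
        WeakReference<Counter> weakCounter = new WeakReference<>(counter);
        counter = null; // the object is now only weakly reachable
        System.gc();    // a hint; collection is likely but not guaranteed
        // Prints null once the reference has been cleared, otherwise the object.
        System.out.println("after nulling: " + weakCounter.get());
    }
}
```

The second half of main corresponds to Snippet 1: the weak reference does not keep the object alive, it only lets you find out whether the object is still there.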

The difference is, if I remember correctly:
The first code snippet assigns null to the variable counter after the object has been passed to weakCounter. Therefore weakCounter still refers to the old Counter object even though counter no longer does. But the object can still be collected by the garbage collector, even though weakCounter refers to it.
In the second code example, counter goes from referring to an object to holding null, letting Java know "hey, you can collect me in the garbage!"
Hope this made sense and helped your understanding. If some of my facts are wrong, please feel free to tell me where I am mistaken :)

The two are essentially equivalent, save for the fact that you might be able to reference the object through the WeakReference if you do so before GC collects it.
The purpose of the WeakReference is so you can have it stashed somewhere (eg, some sort of search index) and not worry about having to clear it if you are done with the object and wish to null any "strong" references (so that the object may be collected and the space reused). If you used an ordinary strong reference you'd have to be sure to clear it or the object would hang around forever.
(SoftReferences, as mentioned by Ted Hopp, are similar in mechanics, except that GC will only collect the referenced objects if storage is tight. This makes them suitable for things like cached internet pages.)
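A rough sketch of the "stash it in an index" idea follows. WeakIndex is a hypothetical class written for illustration, not a JDK type. It assumes String keys and prunes an entry lazily when a lookup finds that the referent is gone.

```java
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

// Hypothetical index that never keeps its values alive on its own.
class WeakIndex<V> {
    private final Map<String, WeakReference<V>> entries = new HashMap<>();

    void put(String key, V value) {
        entries.put(key, new WeakReference<>(value));
    }

    // Returns the indexed object, or null if it was never indexed or has been collected.
    V lookup(String key) {
        WeakReference<V> ref = entries.get(key);
        if (ref == null) {
            return null;
        }
        V value = ref.get();
        if (value == null) {
            entries.remove(key); // referent was collected; drop the stale entry
        }
        return value;
    }
}
```

Callers that are done with an object simply drop their own strong references; the index needs no explicit clean-up call. The JDK's java.util.WeakHashMap offers related behaviour for weakly held keys.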

Related

Garbage Collection and InComplete Constructed Object

This may be a very naive question.
Suppose I have a class something like this:
class SlowConstructor {
private final int a;
private final String unReachableString;
public SlowConstructor(String random) {
unReachableString = "I am not reachable will GC will collect me " + random;
Thread.sleep(1000*3600); // ignoring the exception check for readability
a = 100;
Thread.sleep(1000*3600);
}
}
So my question is: if I create many SlowConstructor objects (say 50, in different threads), each constructor will, as you can see, take two hours to complete. The String reference in SlowConstructor unReachableString is not reachable from any code for around two hours. If the GC runs during these two hours, will it not collect the String that unReachableString refers to? I assume it will not be garbage collected, but then why? From where is unReachableString reachable?
The String reference in SlowConstructor unReachableString is not reachable from any code for around two hours.
Incorrect. The SlowConstructor object is immediately reachable from the thread that is in the process of constructing it. So, therefore, is the string.
So that means that the String object won't be garbage collected before the constructor completes.
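(In addition, the literal part of the string, "I am not reachable will GC will collect me ", is a constant that stays reachable from any code that uses the literal. The concatenated result is a new String object, reachable through the field of the object under construction.)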
The concept of reachability includes any mechanism by which any current or future execution could use the object in question. That includes cases where the object hasn't been assigned to a named variable or array element ... yet.
As others have said, GC is not going to affect a half-constructed object. But why? GC necessarily proceeds from a maximal set of root pointers. Anything that can be reached from these roots is "protected" from GC. This is done either by marking, as in mark-and-sweep collectors, or by copying to a new active generation (arena) in a copying collector. Roots consist of the runtime stack, machine (virtual or physical) registers, and global pointers. When the constructor starts running, a pointer to the newly allocated record will be created. Either it will be a root or it will be accessible from a root, so the GC will not collect it. Since the class instance under construction is accessible from a root, so is the string you're referring to. Therefore it can't be collected either.
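The point can be illustrated with a small sketch. HalfBuilt is a made-up class loosely modelled on SlowConstructor, with the two-hour sleeps replaced by a System.gc() request (itself only a hint). Whatever the collector does in the middle of the constructor, the string stays reachable through this, which the constructing thread holds on its stack.

```java
// Hypothetical class modelled on SlowConstructor from the question.
class HalfBuilt {
    private final String message;
    private final int a;

    HalfBuilt(String random) {
        message = "still reachable " + random;
        // 'this' is referenced from the constructing thread's stack (a GC root),
        // and the string is reachable through this.message.
        System.gc(); // a hint only; even if a collection runs, message must survive it
        a = message.length(); // safe: the string cannot have been collected
    }

    String message() { return message; }

    int a() { return a; }
}
```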
So long as the threads weren't interrupted, your object will (eventually) instantiate, and (eventually) contain a value for unReachableString.
String literals are interned, and an interned string is subject to garbage collection only if nothing refers to it, which is how garbage collection works for any object. The half-constructed object does refer to the string, so it is not yet eligible for garbage collection.
I'm willing to bet that having fifty or so instances of this type floating around* wouldn't make a difference either - you then have fifty or so references to this string literal, and it wouldn't be yet eligible for garbage collection until these instances were eligible for garbage collection themselves.
*: OH GOD NO PLEASE DON'T DO THIS IN ACTUAL CODE PLEASE
It will not and should not be garbage collected. A sleeping thread is still a live thread.
Reachable, in a GC context, means the following: if we walk the stack, will we find a reference pointing to this object (memory space) on the heap?
In your case the answer is yes.
Your logic is not correct. If the thread is still alive, it is still inside the SlowConstructor constructor, so the JVM assumes that unReachableString may still be used, and garbage collection does not touch the object it refers to.
From the code you might conclude that unReachableString is not used and so should be garbage collected, but the JVM has no logic to predict what happens next. It just looks at the scope of the method and at the object references.

Java: reliably calling GC?

In some library found on google code I came across this util method:
public static void gc(){
Object obj = new Object();
WeakReference ref = new WeakReference<Object>(obj);
obj = null;
while(ref.get()!=null)
System.gc();
}
Its doc says it provides a reliable way to call GC, because calling System#gc() is just a hint without any guarantees. I showed it to my senior, and he said I should think about why this method is invalid.
I read some articles on weak references but I'm still confused.
Can somebody show me the point?
I have direct experience with the supposed "safe GC" idiom you have posted.
It doesn't work.
The reason is quite simple: just the fact that a weak ref is cleared is not a signal that the object has been collected; it only means that it has become unreachable through any strong or soft reference. In my experience this signal arrives before the object is reclaimed.
A better attempt would be to use a Phantom reference, which at least ensures that the object has entered the finalizable state, but once again, it can still be occupying the heap, and in practice it still is occupying it. Another explanation could be that this particular object, obviously residing in Eden space, did get reclaimed, but the GC run which did it was not exhaustive and there is more memory to reclaim in other generations.
On the other hand, I have very reliably used this trivial idiom:
for (int i = 0; i < 3; i++) { System.gc(); Thread.sleep(500); }
Try it and see if it works for you. The sleep part is optional: it is needed only if System.gc() uses concurrent sweeping.
If you object to the apparent fickleness of this approach, just remember that any approach to explicit GC-ing is fickle. This one is at least honest about it—and just happens to work on actual systems. It is, naturally, non-portable and can cease to work at any time for a large array of reasons. Even so, it is the best you'll ever get.
The point is that System.gc() is not required to clear all weak references, and behaviour varies between Java virtual machines. If System.gc() decides not to clear that reference on the first call, it is very likely to decide the same on the next call. Hence you have a possibly infinite loop, which may terminate only when other threads change the state of the heap enough to trigger a collection.
So: once is enough.
There is no way to guarantee a GC call because just as the documentation says System.gc is just a hint that can be ignored by the system.
So assume that the JVM ignores System.gc. In that case the whole thing just loops until some other part of the system causes a GC. If you run single-threaded, or nobody else allocates much memory, you basically create an infinite loop here.
The point is that your thread will stop and wait until the weak reference is cleared, thus "simulating" garbage collection. There's no guarantee when (or indeed even IF) this will actually happen.
You could be stuck waiting on this while for a long, long time.
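If someone insists on this pattern, a bounded variant at least avoids waiting forever. The following is a sketch only: BoundedGc and tryGc are made-up names, the 10 ms pause is an arbitrary choice, and a true result means only that the probe's weak reference was cleared, not that any particular amount of memory was reclaimed.

```java
import java.lang.ref.WeakReference;

class BoundedGc {
    // Requests GC until a probe object's weak reference is cleared or the timeout elapses.
    // Returns true only if the reference was cleared in time.
    static boolean tryGc(long timeoutMillis) {
        Object probe = new Object();
        WeakReference<Object> ref = new WeakReference<>(probe);
        probe = null; // the probe is now only weakly reachable
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (ref.get() != null) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // give up instead of spinning forever
            }
            System.gc(); // still only a hint
            try {
                Thread.sleep(10); // arbitrary pause between requests
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt and stop waiting
                return false;
            }
        }
        return true;
    }
}
```

A call such as BoundedGc.tryGc(200) returns after roughly 200 ms at worst, plus the duration of the last collection, whether or not the JVM honoured the request.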
Programmatically, we need to ensure that when an object is removed, its corresponding entry is removed as well. Only then does the object become a candidate for garbage collection. Otherwise, even though it is no longer used at run time, the stale object will not be garbage collected.
The documentation for get() says it returns: "The object to which this reference refers, or null if this reference object has been cleared."
While your object is still strongly referenced, get() on the WeakReference will not return null. Once the object has been removed by the GC, it returns null.
Object obj = new Object();
WeakReference ref = new WeakReference<Object>(obj);
obj = null;
if(ref.get()!=null)
{
System.gc();
System.out.println("remove ref");
}
if(ref.get()!=null){
System.out.println("not execute");
}
Output:
remove ref
If you don't assign null to obj:
Object obj = new Object();
WeakReference ref = new WeakReference<Object>(obj);
if(ref.get()!=null)
{
System.gc();
System.out.println("remove ref");
}
if(ref.get()!=null){
System.out.println("execute");
}
Output:
remove ref
execute
Code that tries to force GC is usually a sign of a bigger underlying problem (i.e. a design issue or missing knowledge on the developer's part).
I have seen a few use cases where calling System.gc() in production code actually makes sense, for example, before printing the current memory usage - it doesn't matter if the values are off but we'd like to improve chances the values are as small as possible. Of course, we knew that GC was enabled - we used this to automatically detect memory leaks on a QA system.
In general, calling System.gc() yells "my code is buggy and I don't know how to fix it!".

What are the chances of getting exactly the same object reference twice

I sometimes assume that if oldObject != newObject then the object has changed - which seems a fair assumption in most cases but is it truly a bad assumption?
In short, under what situation could the following code print "Same!"?
static WeakReference<Object> oldO = null;
...
Object o = new Object();
oldO = new WeakReference<Object>(o);
// Do some stuff with o - could take hours or even days to complete.
...
// Discard o (or let it go out of scope).
o = null;
// More stuff - could be hours or days later.
...
o = new Object();
// Later still.
if ( o == oldO.get() ) {
System.out.println("Same!");
}
I realise that this is indeed remotely possible because an object reference is essentially the memory address of the object (or could be in some JVM). But how likely is it? Are we talking decades of run-time before it actually happens?
Added
My apologies - please assume that oldO is some form of weak reference that does not stop the object from being collected. Perhaps it is a WeakReference, as the code (now) suggests, or the reference is stored in a database or a file somewhere.
(I'm answering what I think you really wanted to know, rather than the particular snippet you have.)
It's implementation dependent. The contract of an object reference is that as long as the object is still alive, no other object will compare == with it. This implies that after the object is garbage collected, the VM is free to reuse the same reference value.
An implementation of Java may choose to use an increasing integer for object references, in which case you can only get the same reference value again when the counter overflows back to 0. Other implementations may use the memory location, which makes it more likely for the same reference value to be reused. In any case, you should define your own object identity if that matters.
It will never be the same. oldO will always reference the initial object, so that object will never be discarded and a new object can't have the same address.
UPDATE: it seems the question was updated to specify that oldO is a weak reference. In this case, when the object goes away, oldO's referent will become null. This means it will never match another object in the JVM.
It is not possible for them to be equal. You still have a reference to the old object (oldO), so it will never be discarded.
o == oldO means o is the same memory address as oldO. So that cannot happen unless, at some point, you do either o = oldO or oldO = o. By transitivity, doing foo = o; oldO = foo or anything equivalent will of course achieve the same result.
Firstly, memory address is irrelevant. Java ain't C. Object identity is a JVM implementation detail: it may or may not rely on the memory address, but more likely does not, since the JVM is free to move the object around in memory but must maintain its identity.
But regardless, because you hold a reference to the original object, the second one cannot be the "same" object.
It will never happen.
Your first object (oldO) is stored at a specific memory location.
Your second object will always be placed at a different memory location, as long as oldO is referenced.
So oldO == o will compare both memory addresses, which will always be different.
If you drop the reference held in oldO, the old object will be garbage collected, and you will eventually be able to create a new object at the same address. But you won't be able to compare it with oldO, because oldO no longer refers to it.
By having a reference to your old object, you prevent it being garbage collected, so you prevent that bit of memory being available for a new object, so they could never be equal.
I wondered if you used a SoftReference you could keep a reference to the old object, while allowing it to be garbage-collected BUT:
1) I assume once the old object is collected, the SoftReference is set to null and
2) this is artificially trying to force the situation, so doesn't really prove anything :-)
According to the documentation, weak references will be cleared when the object is garbage collected. It does not specify what it means to be "cleared", but presumably it is set to null. If it is in fact set to null, null will never be declared == to any object reference, regardless of its memory location.
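A small sketch of that last point: here clear() is called explicitly to simulate, deterministically, what the collector does to a weak reference (an assumption made purely for the demonstration; ClearedReferenceDemo is a made-up name). Once the reference is cleared, get() returns null, and null is never == to a live object, wherever that object happens to sit in memory.

```java
import java.lang.ref.WeakReference;

class ClearedReferenceDemo {
    // Returns whether a fresh object compares == to the referent of a cleared weak reference.
    static boolean clearedRefMatchesNewObject() {
        WeakReference<Object> oldO = new WeakReference<>(new Object());
        oldO.clear();            // stands in for the collector clearing the reference
        Object o = new Object(); // may or may not reuse the old object's address
        return o == oldO.get();  // oldO.get() is null here, so the comparison is false
    }
}
```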

Defining java object inside a loop , do I need to use null to free memory?

If I have a loop and create a new object inside it
for (int i = 0; i < 10; i++)
{
MyObject obj = new MyObject();
obj.use();
}
Do I need to say obj = null, inside the loop at the beginning or end to release memory used by that object , or by using "new" that object will be send to GC ? and can I see this in terms of memory usage ?
update : so in case I have big object and long loop , should I assign the object to null or no ?
Check this: http://javarevisited.blogspot.com/2011/04/garbage-collection-in-java.html
"An Object becomes eligible for Garbage collection or GC if its not reachable from any live threads or any static references". After the loop ends, the objects that you created inside the loop do not have any external references pointing to them and are eligible for garbage collection.
EDIT:
If you want to see memory usage, you can profile your application using an IDE that has such a feature. For example, NetBeans has a nice interface that shows live memory usage for object allocation.
EDIT 2:
"so in case I have big object and long loop , should I assign the object to null or no ?"
No, you do not need to do this. Once one iteration of the loop is complete, there are no active references to any objects created in that iteration so it does not matter that you have a long or short loop.
Do I need to say obj = null, inside the loop at the beginning or end to release memory used by that object , or by using "new" that object will be send to GC ?
Neither, really. new only constructs new objects. When there are no references to the object, such as falling out of scope (i.e., not in the loop block), it will be eligible for garbage collection. Note that Java's garbage collector does not immediately collect objects - it does it in batches when it feels that it is required.
and can I see this in terms of memory usage ?
I would suggest looking at VisualVM, included with your JDK. It has a memory view, and a garbage collector view through a plugin.
Note that you cannot rely on the operating system "in use" count - the Java heap will rarely shrink especially if there aren't any major collections.
Nope, you don't need to set obj to null. When it is reassigned by the next loop iteration the previous reference will be garbage (unless something else points to it) and eligible for cleanup. That's the point of automatic garbage collection.
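As a rough illustration, the loop below allocates a fresh array on every iteration and never nulls anything. LoopAllocation is a made-up name and the sizes are arbitrary. With the values used in main the loop allocates about 1 GB in total, yet it should complete on a much smaller heap, because each array becomes unreachable as soon as its iteration ends.

```java
class LoopAllocation {
    // Allocates `iterations` arrays of `blockBytes` bytes each, one per loop iteration,
    // and returns the number of iterations completed.
    static int run(int iterations, int blockBytes) {
        int completed = 0;
        for (int i = 0; i < iterations; i++) {
            byte[] block = new byte[blockBytes]; // the previous iteration's array is unreachable
            block[0] = 1;                        // touch the array so it is actually used
            completed += block[0];
            // no "block = null" needed: the variable does not outlive the iteration
        }
        return completed;
    }

    public static void main(String[] args) {
        // About 1 GB allocated in total (1,000 arrays of 1 MB).
        System.out.println(run(1_000, 1 << 20));
    }
}
```

Running it with a deliberately small heap, for example with the standard -Xmx64m option, is one way to see that nulling the variable is unnecessary.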
However, there are some cases where you have to watch things to keep memory under control. If a static field points to an object, that object may never get cleaned up (it's really not garbage, since it has a live reference). One common issue is caches: a cache may hold on to old, stale data that never gets cleaned up.
As it is, even if you call the GC explicitly it will not necessarily run immediately, though you can still call it if you wish.
For memory management you can look into the features of IDEs like NetBeans, Eclipse, etc.

Does variable = null set it for garbage collection

Help me settle a dispute with a coworker:
Does setting a variable or collection to null in Java aid in garbage collection and reducing memory usage? If I have a long running program and each function may be iteratively called (potentially thousands of times): Does setting all the variables in it to null before returning a value to the parent function help reduce heap size/memory usage?
That's old performance lore. It was true back in 1.0 days, but the compiler and the JVM have been improved to eliminate the need (if ever there was one). This excellent IBM article gets into the details if you're interested: Java theory and practice: Garbage collection and performance
From the article:
There is one case where the use of explicit nulling is not only helpful, but virtually required, and that is where a reference to an object is scoped more broadly than it is used or considered valid by the program's specification. This includes cases such as using a static or instance field to store a reference to a temporary buffer, rather than a local variable, or using an array to store references that may remain reachable by the runtime but not by the implied semantics of the program.
Translation: "explicitly null" persistent objects that are no longer needed. (If you want. "Virtually required" too strong a statement?)
The Java Language Specification
12.6.1 Implementing Finalization
Every object can be characterized by two attributes: it may be reachable, finalizer-reachable, or unreachable, and it may also be unfinalized, finalizable, or finalized.
A reachable object is any object that can be accessed in any potential continuing computation from any live thread. Optimizing transformations of a program can be designed that reduce the number of objects that are reachable to be less than those which would naively be considered reachable. For example, a compiler or code generator may choose to set a variable or parameter that will no longer be used to null to cause the storage for such an object to be potentially reclaimable sooner.
Discussion
Another example of this occurs if the values in an object's fields are stored in registers. The program may then access the registers instead of the object, and never access the object again. This would imply that the object is garbage.
The object is reachable if it can be involved in any potential continuing computation. So if your code refers to a local variable, and nothing else refers to it, then you might cause the object to be collected by setting it to null. This would either give a null pointer exception, or change the behaviour of your program, or if it does neither you didn't need the variable in the first place.
If you are nulling out a field or an array element, then that can possibly make sense for some applications, and it will let the memory be reclaimed sooner. One case is creating a large array to replace an existing array referenced by a field in a class: if the field is nulled before the replacement is created, that may relieve pressure on memory.
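The array-replacement case might look like the sketch below. ResizableBuffer is a hypothetical class, and it assumes the old contents do not need to be preserved. Without the explicit null, both the old and the new array would have to fit in the heap at the moment the new one is allocated.

```java
// Hypothetical holder of a large buffer referenced from a field.
class ResizableBuffer {
    private byte[] data;

    ResizableBuffer(int size) {
        data = new byte[size];
    }

    // Replaces the buffer without preserving its contents.
    void replace(int newSize) {
        data = null;              // old array is unreachable before the new one is allocated
        data = new byte[newSize]; // the collector may reclaim the old array to satisfy this
    }

    int size() {
        return data.length;
    }
}
```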
Another interesting feature of Java is that scope doesn't appear in class files, so scope is not relevant to reachability; these two methods create the same bytecode, and hence the VM does not see the scope of the created object at all:
static void withBlock () {
int x = 1;
{
Object a = new Object();
}
System.out.println(x+1);
}
static void withoutBlock () {
int x = 1;
Object a = new Object();
System.out.println(x+1);
}
Not necessarily. An object becomes eligible for garbage collection when there are no live threads anymore that hold a reference to the object.
Local variables go out of scope when the method returns and it makes no sense at all to set local variables to null - the variables disappear anyway, and if there's nothing else that holds a reference the objects that the variables referred to, then those objects become eligible for garbage collection.
The key is not to look at just variables, but look at the objects that those variables refer to, and find out where those objects are referenced by your program.
It is useless on local variables, but it can be useful/needed to clear up instance variables that are not required anymore (e.g. post-initialization).
(Yeah yeah, I know how to apply the Builder pattern...)
That could only make some sense in some scenario like this:
public void myHeavyMethod() {
List hugeList = loadHugeListOfStuff(); // lots of memory used
ResultX res = processHugeList(hugeList); // compute some result or summary
// hugeList = null; // we are done with hugeList
...
// do a lot of other things that takes a LOT of time (seconds?)
// and which do not require hugeList
...
}
Here it could make some benefit to uncomment the hugeList = null line, I guess.
But it would certainly make more sense to rewrite the method (perhaps refactoring into two,
or specifying an inner scope).
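A sketch of the "refactor into two methods" option is shown below. The bodies of loadHugeListOfStuff and processHugeList are invented stand-ins (the original does not show them), and a long replaces ResultX. The point is only that hugeList is a local of the helper, so nothing refers to the list once the helper returns.

```java
import java.util.ArrayList;
import java.util.List;

class HeavyWork {
    // Hypothetical stand-in for loadHugeListOfStuff().
    static List<Integer> loadHugeListOfStuff() {
        List<Integer> list = new ArrayList<>();
        for (int i = 1; i <= 1_000; i++) {
            list.add(i);
        }
        return list;
    }

    // Hypothetical stand-in for processHugeList(): sums the elements.
    static long processHugeList(List<Integer> hugeList) {
        long sum = 0;
        for (int v : hugeList) {
            sum += v;
        }
        return sum;
    }

    // The list is local to this helper, so it is unreachable once the helper returns.
    static long loadAndSummarize() {
        List<Integer> hugeList = loadHugeListOfStuff();
        return processHugeList(hugeList);
    }

    static long myHeavyMethod() {
        long res = loadAndSummarize();
        // ... long-running work that needs only res, not the list ...
        return res;
    }
}
```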
Setting an object reference to null only makes the object eligible for garbage collection (provided nothing else refers to it).
It does not necessarily free the memory; that depends on when the garbage collector runs, which is up to the JVM.
When the garbage collector runs, it frees heap space by deleting only the objects that are eligible for garbage collection.
It is nice to have. When you set a reference to null, there is a possibility that the object will be garbage collected sooner, in the next GC cycle. But there is no guaranteed mechanism to make an object be garbage collected at a given time.
