Arrays and Garbage collection in Java

Arrays and Garbage collection in Java - java

Let us say I have an Array a, where the array is of type T. Would setting an element to null mark it for garbage collection.
For example, If I do a[36] = null, or do I need to something more, like also set fields in that object of type T tonull?

In Java, objects are stored on the heap, whereas variables/references are stored on the stack. The GC performs what is called a 'cycle', which checks which variables no longer refer to actual datatypes, as well as checking if objects are still referred to in the scope. As mario mentioned, an object will eventually be collected when nothing holds a reference to it, however in some performance/memory critical applications, setting objects to null and trying to speed up the garbage collection process has been known to provide marginal performance benefits. In this case, I wouldn't worry about it too much.

As others said, it's hard to tell without the code. However, this might help:
According to Josh Bloch's Effective Java 2nd Edition chapter 2 item 6, you need to set the reference to null in order to allow it to be GC'd if you manage your own memory. He explains that if you don't null a reference it can become an obsolete references and these can cause an OutOfMemoryError.
The example he gives is the following (I'm shortening it). Consider a stack implementation where you can push and pop objects. The problem manifests in the pop operation:
public class Stack {
private Object[] elements;
private int size = 0;
public Object pop() {
if (size == 0)
throw new EmptyStackException();
Object result = elements[--size];
elements[size] = null; // Eliminate obsolete reference, or you'll have a "memory leak"
return result;
}
Note that you're controlling the allocated size manually with the size variable and the GC can't know which elements are allocated and which are free.
Go ahead and read that section in his book for more information if you find it relevant. Your case has similarities to what I wrote, but we can't be sure without code.

Related

What is an Obsolete reference in java

I'm reading the Effective Java book and its saying that eliminating obsolete reference is one of best way to avoid memory leaks. according to the below program, by doing -> elements[size] = null; its eliminating obsolete references in that program.
My problem here what is the advantage of doing elements[size] = null;. Any other program can use that freed memory location? Or is it garbage collected?
According to my understanding the array is already allocated the memory for its size. Even we do elements[size] = null; anyone can't use that freed memory location until you do elements = null;. Please someone tell me what is advantage of doing elements[size] = null; here.
public Object pop() {
if (size == 0)
throw new EmptyStackException();
Object result = elements[--size];
elements[size] = null; // Eliminate obsolete reference
return result;
}

My problem here what is the advantage of doing elements[size] = null;.
Here obsolete references refer to object references not required any longer for the program.
You want that unnecessary objects to be free to consume only memory that your program need. Generally it is done for the good working of the current application.
Any other program can use that freed memory location?
Theoretically yes but it also depends on the JVM memory options used. You don't generally focus on it.
elements[size] = null and elements = null; don't have at all the same intention and the same effects.
In the context of the book, elements is a structural intern of a class.
The idea is that some elements of the array may be stale and not required any longer after some removal operations.
The first one (elements[size] = null) will make the object of the array element located at the size index to be eligible to be GC if no other objects reference .
But the second one (elements = null) is much more. It will make all elements of the array to be eligible to be GC if no other objects reference it.

There are two cases we have to distinguish:
The outer object is "teared down" somehow, so it closes any open resource and also "voluntarily" releases all objects it had referred to. This s simply the explicit way of telling the jvm that the corresponding refence is "gone". You make it easier for the gc to understand: the corresponding object is eligible for garbage collection. Of course, that only has that effect if there are no other references to the same object elsewhere. And beyond: doing so isn't really required, the jvm/gc must of course be able to detect any eligible object all by itself.
But nullifying makes sense for refences that exist for longer periods of time, pointing to different objects over that time span. Like a container, such as the stack class in the underlying example. A container must forget about objects it referenced to when they get "removed". Otherwise you create a memory leak!

What happens here?
Let's imagine, elements is a 20-elements Object array (elements = new Object[20];), and has been filled with 18 BigInteger instances, the remaining two places being null.
So the heap now contains 18 BigInteger instances and a 20-elements Object[] array. The garbage collector won't reclaim any of these instances, and that's okay as you'll most probably use them later (via the pop() method).
Now you call the pop() method to get the BigInteger most recently added to the array. Let's assume you just want to print it and then forget it, so in your overall application that number isn't needed any more, and you'd expect the garbage collector to reclaim it. But that won't happen unless you do the null assignment
elements[size] = null; // Eliminate obsolete reference
Why that?
As long as you store the reference to an object in some accessible place, the garbage collector believes that you'll still need the object later.
As long as elements[17] still refers to the BigInteger, it can potentially be accessed by your program, so it can't be reclaimed. If elements[17] points to null, the BigInteger that used to be there isn't accessible via elements any more and can be reclaimed by the garbage collector (if no other part of your code still uses it).
Conclusion
It's only worth thinking about "obsolete references" if you have a long-living storage structure that contains fat objects, and you can tell at some point in time that you won't need one of the stored objects any more. As you won't need this object any more, you can now re-assign the storage with null, and then the GC no longer believes you still need the object and is able to reclaim the storage space.

Does referencing array index creates a memory leak?

I am reading "Item 6: Eliminate obsolete object references" of Effective Java second edition.
Below is the code snippet.
//Can you spot the "memory leak"?
public class Stack {
private Object[] elements;
private int size = 0;
private static final int DEFAULT_INITIAL_CAPACITY = 16;
public Stack() {
elements = new Object[DEFAULT_INITIAL_CAPACITY];
}
public void push(Object e) {
ensureCapacity();
elements[size++] = e;
}
public Object pop() {
if (size == 0)
throw new EmptyStackException();
return elements[--size];
}
/**
* Ensure space for at least one more element, roughly doubling the capacity
* each time the array needs to grow.
*/
private void ensureCapacity() {
if (elements.length == size)
elements = Arrays.copyOf(elements, 2 * size + 1);
}
}
As per this item, memory leak is because after popping, array index was not referenced to NULL like below:
public Object pop() {
if (size == 0)
throw new EmptyStackException();
Object result = elements[--size];
elements[size] = null; // Eliminate obsolete reference
return result;
}
My understanding have been that suppose for an given array, I have done elements[0] = new Object() and then I do this again elements[0] = new Object() then my first object would be eligible for garbage collection because 0th index of my array is no more pointing to it.
Is my understanding incorrect? If it is correct then how it is shown as memory leak in Effective Java.

You got most of it.
If you do:
elements[0] = someOtherObject;
then the other element stored at index 0 is no longer referenced and might be collected.
But the first pop() implementation keeps that reference in place - it only decreases the "counter" of stored elements. Therefore that object is still referenced - and won't be collected until a new object is added to the stack!
As the comment in the second version of pop() clearly states - the reference has to be eliminated to ensure that the stack doesn't keep a reference to that very object. The object is supposed to be popped - so the stack should not keep knowledge about that removed object!
And to confirm the commit: yes, when one pushes n object, then pushes n other objects, then you don't have a memory leak - because the underlying array references will all be updated and point to new objects then. And yes, if less than n objects get pushed after popping, stale references are kept and preventing garbage collection here.

The issue pertains to the fact that the array is still holding reference to objects which have only been logically popped off the array(decreasing the size counter). This means that the only way to ever get this memory back would be to garbage collect the entire stack by setting it to null.
You are correct with your case that if you just re-assigned to the nth index it would not be a leak, because you still expect that object to exist. However with pop, your aim is to decrease the size of the stack, which means any memory which was assigned for the top of the stack, should be collected after popping.

Quoting from Effective Java (emphasis mine)
If a stack grows and then shrinks, the objects that were popped off the stack will not be garbage collected, even if the program using the stack has no more references to them. This is because the stack maintains obsolete references to these objects. An obsolete reference is simply a reference that will never be dereferenced again. In this case, any references outside of the “active portion” of the element array are obsolete. The active portion consists of the elements whose index is less than size.
He refers to the references of the elements that are popped off.
But, you are right in your example, when you store the reference to a new Object at index 0, there is no reference to the first Object and hence it is eligible for Garbage creation.
But say,
You create five Objects (elements[0]... elements[4])
You pop three elements. This would leave your top variable (size here) pointing at index 2.
But still, you would have 5 active references which would prevent the last three objects from being garbage collected.

The term "Memory Leak" was borrowed from C and is often misused in Java. Memory Leak in C sense is a range of bytes allocated on heap that has no references in the code and thus can't be freed. For example:
// ...
char* leak = malloc(10); // Local reference to heap
return; // reference lost
In Java such leaks are impossible, since any lost reference is subject to GC.
There are situations, however, that can result in Java code using more memory than it should. Your code represents one of many possible examples of such behavior. In your case, as it was explained in previous answers, some elements of the stack will remain in the heap just because the array is holding references to objects that are no longer needed. In GC environments this is usually called "Lingering objects". A good way to spot memory usage problems in Java is to check heap in use after GC. If heap usage after GC is consistently going up, you probably have lingering objects or other memory allocation problems. For example, if heap in use is 1M after 1st GC, 2M after 2nd GC and 3M after 3d GC - you should probably use a Java Memory Profiler to pinpoint the problem. Note that in your example heap usage will not go up between GCs, but it will not go down either. If you assign null to unused objects, the heap usage will go down after GC if the stack shrinks.

Java - JDBC Memory Use After Closing Result/Statement/Connection [duplicate]

I was browsing some old books and found a copy of "Practical Java" by Peter Hagger. In the performance section, there is a recommendation to set object references to null when no longer needed.
In Java, does setting object references to null improve performance or garbage collection efficiency? If so, in what cases is this an issue? Container classes? Object composition? Anonymous inner classes?
I see this in code pretty often. Is this now obsolete programming advice or is it still useful?

It depends a bit on when you were thinking of nulling the reference.
If you have an object chain A->B->C, then once A is not reachable, A, B and C will all be eligible for garbage collection (assuming nothing else is referring to either B or C). There's no need, and never has been any need, to explicitly set references A->B or B->C to null, for example.
Apart from that, most of the time the issue doesn't really arise, because in reality you're dealing with objects in collections. You should generally always be thinking of removing objects from lists, maps etc by calling the appropiate remove() method.
The case where there used to be some advice to set references to null was specifically in a long scope where a memory-intensive object ceased to be used partway through the scope. For example:
{
BigObject obj = ...
doSomethingWith(obj);
obj = null; <-- explicitly set to null
doSomethingElse();
}
The rationale here was that because obj is still in scope, then without the explicit nulling of the reference, it does not become garbage collectable until after the doSomethingElse() method completes. And this is the advice that probably no longer holds on modern JVMs: it turns out that the JIT compiler can work out at what point a given local object reference is no longer used.

No, it's not obsolete advice. Dangling references are still a problem, especially if you're, say, implementing an expandable array container (ArrayList or the like) using a pre-allocated array. Elements beyond the "logical" size of the list should be nulled out, or else they won't be freed.
See Effective Java 2nd ed, Item 6: Eliminate Obsolete Object References.

Instance fields, array elements
If there is a reference to an object, it cannot be garbage collected. Especially if that object (and the whole graph behind it) is big, there is only one reference that is stopping garbage collection, and that reference is not really needed anymore, that is an unfortunate situation.
Pathological cases are the object that retains an unnessary instance to the whole XML DOM tree that was used to configure it, the MBean that was not unregistered, or the single reference to an object from an undeployed web application that prevents a whole classloader from being unloaded.
So unless you are sure that the object that holds the reference itself will be garbage collected anyway (or even then), you should null out everything that you no longer need.
Scoped variables:
If you are considering setting a local variable to null before the end of its scope , so that it can be reclaimed by the garbage collector and to mark it as "unusable from now on", you should consider putting it in a more limited scope instead.
{
BigObject obj = ...
doSomethingWith(obj);
obj = null; // <-- explicitly set to null
doSomethingElse();
}
becomes
{
{
BigObject obj = ...
doSomethingWith(obj);
} // <-- obj goes out of scope
doSomethingElse();
}
Long, flat scopes are generally bad for legibility of the code, too. Introducing private methods to break things up just for that purpose is not unheard of, too.

In memory restrictive environments (e.g. cellphones) this can be useful. By setting null, the objetc don't need to wait the variable to get out of scope to be gc'd.
For the everyday programming, however, this shouldn't be the rule, except in special cases like the one Chris Jester-Young cited.

Firstly, It does not mean anything that you are setting a object to null. I explain it below:
List list1 = new ArrayList();
List list2 = list1;
In above code segment we are creating the object reference variable name list1 of ArrayList object that is stored in the memory. So list1 is referring that object and it nothing more than a variable. And in the second line of code we are copying the reference of list1 to list2. So now going back to your question if I do:
list1 = null;
that means list1 is no longer referring any object that is stored in the memory so list2 will also having nothing to refer. So if you check the size of list2:
list2.size(); //it gives you 0
So here the concept of garbage collector arrives which says «you nothing to worry about freeing the memory that is hold by the object, I will do that when I find that it will no longer used in program and JVM will manage me.»
I hope it clear the concept.

One of the reasons to do so is to eliminate obsolete object references.
You can read the text here.

Does variable = null set it for garbage collection

Help me settle a dispute with a coworker:
Does setting a variable or collection to null in Java aid in garbage collection and reducing memory usage? If I have a long running program and each function may be iteratively called (potentially thousands of times): Does setting all the variables in it to null before returning a value to the parent function help reduce heap size/memory usage?

That's old performance lore. It was true back in 1.0 days, but the compiler and the JVM have been improved to eliminate the need (if ever there was one). This excellent IBM article gets into the details if you're interested: Java theory and practice: Garbage collection and performance

From the article:
There is one case where the use of explicit nulling is not only helpful, but virtually required, and that is where a reference to an object is scoped more broadly than it is used or considered valid by the program's specification. This includes cases such as using a static or instance field to store a reference to a temporary buffer, rather than a local variable, or using an array to store references that may remain reachable by the runtime but not by the implied semantics of the program.
Translation: "explicitly null" persistent objects that are no longer needed. (If you want. "Virtually required" too strong a statement?)

The Java VM Spec
12.6.1 Implementing Finalization
Every object can be characterized by two attributes: it may be reachable, finalizer-reachable, or unreachable, and it may also be unfinalized, finalizable, or finalized.
A reachable object is any object that can be accessed in any potential continuing computation from any live thread. Optimizing transformations of a program can be designed that reduce the number of objects that are reachable to be less than those which would naively be considered reachable. For example, a compiler or code generator may choose to set a variable or parameter that will no longer be used to null to cause the storage for such an object to be potentially reclaimable sooner.
Discussion
Another example of this occurs if the values in an object's fields are stored in registers. The program may then access the registers instead of the object, and never access the object again. This would imply that the object is garbage.
The object is reachable if it can be involved in any potential continuing computation. So if your code refers to a local variable, and nothing else refers to it, then you might cause the object to be collected by setting it to null. This would either give a null pointer exception, or change the behaviour of your program, or if it does neither you didn't need the variable in the first place.
If you are nulling out a field or an array element, then that can possibly make sense for some applications, and it will cause the memory to be reclaimed faster. Once case is creating a large array to replace an existing array referenced by a field in a class - if the field in nulled before the replacement is created, then it may relieve pressure on the memory.
Another interesting feature of Java is that scope doesn't appear in class files, so scope is not relevant to reachability; these two methods create the same bytecode, and hence the VM does not see the scope of the created object at all:
static void withBlock () {
int x = 1;
{
Object a = new Object();
}
System.out.println(x+1);
}
static void withoutBlock () {
int x = 1;
Object a = new Object();
System.out.println(x+1);
}

Not necessarily. An object becomes eligible for garbage collection when there are no live threads anymore that hold a reference to the object.
Local variables go out of scope when the method returns and it makes no sense at all to set local variables to null - the variables disappear anyway, and if there's nothing else that holds a reference the objects that the variables referred to, then those objects become eligible for garbage collection.
The key is not to look at just variables, but look at the objects that those variables refer to, and find out where those objects are referenced by your program.

It is useless on local variables, but it can be useful/needed to clear up instance variables that are not required anymore (e.g. post-initialization).
(Yeah yeah, I know how to apply the Builder pattern...)

That could only make some sense in some scenario like this:
public void myHeavyMethod() {
List hugeList = loadHugeListOfStuff(); // lots of memory used
ResultX res = processHugeList(hugeList); // compute some result or summary
// hugeList = null; // we are done with hugeList
...
// do a lot of other things that takes a LOT of time (seconds?)
// and which do not require hugeList
...
}
Here it could make some benefit to uncomment the hugeList = null line, I guess.
But it would certainly make more sense to rewrite the method (perhaps refactoring into two,
or specifying an inner scope).

Setting an object reference to null only makes it eligible for garbage collection.
It does not necessarily free up the memory,which depends on when the garbage collector runs(which depends on JVM).
When the garbage collector runs,it frees up the heap by deleting only the objects which are eligible for garbage collection.

It is a good to have. When you set objects to null, there is a possibility that the object can be garbage collected faster, in the immediate GC cycle. But there is no guaranteed mechanism to make an object garbage collected at a given time.

Explicit nulling

In what situations in java is explicit nulling useful. Does it in any way assist the garbage collector by making objects unreachable or something? Is it considered to be a good practice?

In Java it can help if you've got a very long-running method, and the only reference to the an object is via a local variable. Setting that local variable to null when you don't need it any more (but when the method is going to continue to run for a long time) can help the GC. (In C# this is very rarely useful as the GC takes "last possible use" into account. That optimization may make it to Java some time - I don't know.)
Likewise if you've got a member field referring to an object and you no longer need it, you could potentially aid GC by setting the field to null.
In my experience, however, it's rarely actually useful to do either of these things, and it makes the code messier. Very few methods really run for a long time, and setting a variable to null really has nothing to do with what you want the method to achieve. It's not good practice to do it when you don't need to, and if you do need to you should see whether refactoring could improve your design in the first place. (It's possible that your method or type is doing too much.)
Note that setting the variable to null is entirely passive - it doesn't inform the garbage collector that the object can be collected, it just avoids the garbage collector seeing that reference as a reason to keep the object alive next time it (the GC) runs.

In general it isn't needed (of course that can depend on the VM implementation). However if you have something like this:
private static final Map<String, String> foo;
and then have items in the map that you no longer need they will not be eligible for garbage collection so you would need to explicitly remove them. There are many cases like this (event listeners is another area that this can happen with).
But doing something like this:
void foo()
{
Object o;
// use o
o = null; // don't bother doing this, it isn't going to help
}
Edit (forgot to mention this):
If you work at it, you should find that 90-95% of the variables you declare can be made final. A final variable cannot change what it points at (or what its value is for primitives). In most cases where a variable is final it would be a mistake (bug) for it to receive a different value while the method is executing.
If you want to be able to set the variable to null after use it cannot be final, which means that you have a greater chance to create bugs in the code.

One special case I found it useful is when you have a very large object, and want to replace it with another large object. For example, look at the following code:
BigObject bigObject = new BigObject();
// ...
bigObject = new BigObject(); // line 3
If an instance of BigObject is so large that you can have only one such instance in the heap, line 3 will fail with OutOfMemoryError, because the 1st instance cannot be freed until the assignment instruction in line 3 completes, which is obviously after the 2nd instance is ready.
Now, if you set bigObject to null right before line 3:
bigObject = null;
bigObject = new BigObject(); // line 3
the 1st instance can be freed when JVM runs out of heap during the construction of the 2nd instance.

From "Effective Java" : use it to eliminate obsolete object references. Otherwise it can lead to memory leaks which can be very hard to debug.
public Object pop(){
if(size == 0)
throw new EmptyStatckException();
Object result = elements[--size];
elements[size] = null; //Eliminate Object reference
return result;
}

If you are nulling an object that is about to go out of scope anyway when your method block closes, then there is no benefit whatsoever in terms of garbage collection. It is not unusual to encounter people who don't understand this who work really hard to set a lot of things to null needlessly.

Explicit nulling can help with GC in some rare situations where all of the following are true:
The variable is the only (non-weak) reference to the object
You can guarantee that the object will no longer be needed
The variable will stay in scope for an extended period of time (e.g. it is a field in a long-lived object instance)
The compiler is unable to prove that the object is no longer used, but you are able to guarantee this though your superior logical analysis of the code :-)
In practice this is quite rare in good code: if the object is no longer needed, you should normally be declaring it in a narrower scope anyway. For example, if you only need the object during a single invocation of a method, it should be a local variable, not a field in the enclosing object.
One situation where explicit nulling is genuinely useful: if null is used to indicate a specific state then setting to a null value is sometimes going to be necessary and useful. Null is a useful value in itself for a couple of reasons:
Null checks are extremely fast, so conditional code that checks for null is typically more efficient than many alternatives (e.g. calling object.equals())
You get an immediate NullPointerException if you try to dereference it. This is useful because it is good Fail Fast coding style that will help you to catch logic errors.

See also WeakReference in J2SE.

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.