Can any unused object escape from the garbage collector? - java

Is there any possibility that an object which is not referenced anywhere is still existing on the heap? I mean, is there a possibility of an unused object escaping the garbage collector and staying on the heap until the end of the application?
I want to know because, if that can happen, I can be more cautious while coding.

If an object is no longer referenced, it does still exist on the heap, but it is also free to be garbage-collected (unless we are talking about Class objects, which live in PermGen space and never get garbage-collected - but this is generally not something you need to worry about).
There is no guarantee on how soon that will be, but your application will not run out of memory before memory from those objects is reclaimed.
However, garbage collection does involve overhead, so if you are creating more objects than you need to and can easily create fewer, then by all means do so.
Edit: in response to your comment, if an object is truly not referenced by anything, it will be reclaimed during garbage collection (assuming you are using the latest JVM from Sun; I can't speak for other implementations). The reason is as follows: all objects are allocated contiguously on the heap. When a GC is to happen, the JVM follows all references to "mark" objects that it knows are reachable - these objects are then moved into another, clean area. The old area is then considered to be free memory. Anything that cannot be found via a reference cannot be moved. The point is that the GC does not need to "find" the unreferenced objects. If anything, I would be more worried about objects that are still referenced when they are not intended to be, which will cause memory leaks.
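Following up on the "create fewer objects" advice above, here is a rough sketch (not part of the original answer) of a typical avoidable-allocation pattern and its cheaper equivalent, building a string in a loop:

class Concat {
    // Allocates a new String (plus a hidden temporary builder) on every iteration.
    static String slow(String[] parts) {
        String result = "";
        for (String part : parts) {
            result = result + part;
        }
        return result;
    }

    // Reuses one StringBuilder; same output, far fewer temporary objects for the GC.
    static String fast(String[] parts) {
        StringBuilder sb = new StringBuilder();
        for (String part : parts) {
            sb.append(part);
        }
        return sb.toString();
    }
}

Both versions are correct; the second simply gives the collector less garbage to deal with on each call.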

You should know that, before a JVM throws an OutOfMemoryError, it will have garbage-collected everything possible.

If an instance is no longer referenced, it is a possible candidate for garbage collection. This means that sooner or later it can be removed, but there are no guarantees. If you do not run out of memory, the garbage collector might not even run, so the instance may be there until the program ends.
The GC system is very good at finding unreferenced objects. There is a tiny, tiny chance that you end up keeping a weird mix of references where the garbage collector cannot decide for sure whether the object is still referenced or not. But that would be a bug in the GC system and nothing you should worry about while coding.

It depends on when and how often the object is used. If you allocate something and then deallocate it (i.e., remove all references to it) immediately after, it will stay in the "new" part of the heap and will probably be knocked out on the next garbage collection run.
If you allocate an object at the beginning of your program and keep it around for a while (if it survives through several garbage collections), it will get promoted to "old" status. Objects in that part of the heap are less likely to be collected later.
If you want to know all the nitty-gritty details, check out some of Sun's GC documentation.
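If you want to observe this behaviour yourself, one low-effort way (assuming a HotSpot JVM; the exact flag names differ between releases) is to enable GC logging and watch how much memory each minor collection reclaims from the young generation:
-verbose:gc
-XX:+PrintGCDetails (Java 8 and earlier)
-Xlog:gc (Java 9 and later, unified logging)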

Yes; imagine something like this:
Foo foo = new Foo();
// do some work here
while (true) { }
foo.someOp(); // if this is the only reference to foo, it is theoretically
              // impossible to reach this line, so foo should be GC-ed,
              // but all GC systems I know of will not GC it
I am using the definition: garbage = an object that can never be reached in any execution of the code.

Garbage collection intentionally makes few guarantees about WHEN the objects are collected. If memory never gets too tight, it's entirely possible that an unreferenced object won't be collected by the time the program ends.

The garbage collector will eventually reclaim all unreachable objects. Note the "eventually": this may take some time. You can somewhat force the issue with System.gc() but this is rarely a good idea (if used without discretion, then performance may decrease).
What can happen is that an object is "unused" (as in: the application will not use it anymore) while still being "reachable" (the GC can find a path of references from one of its roots -- static fields, local variables -- to the object). If you are not too messy with your objects and structures then you will not encounter such situations. A rule of thumb would be: if the application seems to take too much RAM, run a profiler on it; if thousands of instances of the same class have accumulated without any apparent reason, then there may be some fishy code somewhere. Correction often involves explicitly setting a field to null to avoid referencing an object for too long.
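As a rough sketch of that "unused but still reachable" situation (the class and field names here are invented for illustration):

import java.util.ArrayList;
import java.util.List;

class ReportService {
    // Every report ever generated stays reachable through this static list,
    // even though the application never reads it again: unused, but not garbage.
    private static final List<byte[]> history = new ArrayList<>();

    byte[] generate() {
        byte[] report = new byte[10 * 1024 * 1024];   // ~10 MB per report
        history.add(report);                          // reference kept "just in case"
        return report;
    }
}

The fix is usually either to drop the reference once it is no longer needed (clear the list, or set the field to null) or not to keep it in the first place.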

This is theoretically possible (there is no guarantee the GC will always find all objects), but should not worry you for any real application - it usually does not happen and certainly does not affect a significant chunk of memory.

In theory, the garbage collector will find all unused objects. There could, of course, be bugs in the garbage collector…
That said, "In theory there is no difference between theory and practice, in practice, there is." Under some, mostly older, garbage collectors, if an object definition manages to reach the permanent generation, then it will no longer be garbage collected under any circumstances. This only applied to Class definitions that were loaded, not to regular objects that were granted tenured status.
Correspondingly, if you have a static reference to an object, that object takes up space in the "regular" object heap, and this could conceivably cause problems: the static data is kept reachable by the class definition itself, so it cannot be garbage collected even if you never actually refer to any instances of the class.
In practice though, this is a very unlikely event, and you shouldn't need to worry about it. If you are super concerned about performance, then creating lots of "long-lived" objects, that is, those that escape "escape-analysis", will create extra work for the garbage collector. For 99.99% of coders this is a total non-issue though.

My advice - Don't worry about it.
Reason - It is possible for a non-referenced object to stay on the heap for some time, but it is very unlikely to adversely affect you because it is guaranteed to be reclaimed before you get an out of memory error.

In general, all objects to which there are no live hard references, will be garbage-collected. This is what you should assume and code for. However, the exact moment this happens is not predictable.
Just for completeness, two tricky situations [which you are unlikely to run into] come to mind:
Bugs in JVM or garbage collector code
So-called invisible references - they rarely matter, but I did have to take them into account once or twice during the last 5 years in a performance-sensitive application I work on

Related

Resource Handling Practice

It's assured that the Garbage Collector destroys all the unwanted and unused objects,
but what if we manually nullify objects, e.g. List<String> list = null;?
Does this action have any negative or positive performance effect?
I am on Java.
Thanks.
Not an expert on details of memory handling but I can share what I know. GC will collect whatever is not used. Thus when you eliminate the last reference to an object (by explicitly nullifying) you'll be marking it for garbage collection. This does not guarantee that it'll be collected immediately.
You can explicitly try and invoke GC but you'll see lots of people advising against it. My understanding is that the call to GC is unreliable at best. The whole point with GC and Java is that you as a programmer should not need to worry much about the memory allocation. As for performance, unless you have tight limitations for heap space, you shouldn't notice GC activity.
Garbage collection is the way in which Java reclaims the space occupied by loitering objects. By doing so, it [Java] tries to ensure that your application does not run out of memory because of objects it no longer needs (though this cannot guarantee that the program will never run out of memory).
It is suggested to leave this to the JVM.
Read related : Does setting Java objects to null do anything anymore?
Explicit nulling makes little or no difference. Usually the GC can reliably detect when an object can no longer be reached, and can thus be GCd.
In particular, nulling stack variables (i.e. locals inside methods) helps absolutely nothing. It's trivial for the runtime to automatically detect when they will be needed and when not. Nulling heap variables (i.e. fields inside classes) could in some rare instances help, but that's a rare exception, and it probably does more harm (in code legibility/maintainability) than good.
Also note that nulling doesn't guarantee whether, or when, an object will be GCd.
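One of those rare heap-variable cases where nulling genuinely helps is a class that manages its own storage, for example a simplistic stack (a sketch; the class is hypothetical):

class SimpleStack {
    private Object[] elements = new Object[16];
    private int size;

    void push(Object e) {
        if (size == elements.length) {
            elements = java.util.Arrays.copyOf(elements, 2 * size);
        }
        elements[size++] = e;
    }

    Object pop() {
        Object result = elements[--size];
        elements[size] = null;   // without this, the popped object stays reachable via the array
        return result;
    }
}

The array keeps an obsolete reference after pop(), and the GC has no way to know the slot is logically free, so explicitly nulling it actually matters here.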

what's more efficient? to empty an object or create a new one?

How expensive is 'new'? I mean, should I aim at reusing the same object, or, once the object is 'out of scope', is that just the same as emptying it?
For example, say a method creates a list:
List<Integer> list = new ArrayList<Integer>();
at the end of the method the list is no longer in use - does it mean that there's no memory allocated to it anymore or does it mean that there's a null pointer to it (since it was 'created').
Alternatively, I can pass a 'list' to the method and empty it at the end of the method with list.removeAll(list); will that make any difference from a memory point of view?
Thanks!
It's an ArrayList, so creating a new object means allocating a slab of memory and zeroing it, plus any bookkeeping overhead. Clearing the list means zeroing the memory. This view would lead you to believe that clearing an existing object is faster. But it's likely that the JVM is optimized to make memory allocations fast, so probably none of this matters. So just write clear, readable code, and don't worry about it. This is Java after all, not C.
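To make the two options being compared concrete, here is a sketch (an illustration only, not a benchmark; whether reuse wins depends on list size and allocation pressure, so measure before choosing):

import java.util.ArrayList;
import java.util.List;

class BatchProcessor {
    // Option 1: allocate a fresh list on every call and let the old one be collected.
    List<Integer> collectFresh(int n) {
        List<Integer> list = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            list.add(i);
        }
        return list;
    }

    // Option 2: reuse a caller-supplied list, clearing it first.
    void collectInto(List<Integer> list, int n) {
        list.clear();                 // walks and nulls the old contents
        for (int i = 0; i < n; i++) {
            list.add(i);
        }
    }
}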
at the end of the method the list is no longer in use - does it mean that there's no memory allocated to it anymore or does it mean that there's a null pointer to it (since it was 'created').
It means there are no references to it and the object is eligible for GC.
Alternately, I can send a 'list' to the method and empty it at the end of the method with: list.removeAll(list); will that make any difference from memory point of view?
It's a tradeoff between time and space. Removing the elements from the list is time-consuming, even though you don't need to create new objects.
With the GC capabilities of the latest JVMs, it is OK to create a new object WHEN REQUIRED (though avoiding object creation inside loops is best). Holding references to an object for longer than needed keeps that object from becoming eligible for GC and may cause a memory leak if not handled properly.
I don't know much about memory footprints in Java, but I think emptying a List in order to reuse it is not such a good idea because of the performance impact of emptying the List. I also think it is not a good idea from an OO perspective, because an object should have just one purpose.
At the end of a method the object is indeed out of scope. But that doesn't mean it is garbage collected, or even eligible for garbage collection, because other objects might still reference that List. So basically: if there are no object references to that List then it is eligible for garbage collection, but whether and when it will actually be collected is still unsure. If the List is still in the young generation it is in either the Eden space or a survivor space; otherwise it has been promoted to the tenured space.
The Eden space is where objects are first allocated; when garbage collection happens and the object is still alive, it will be moved to a survivor space. If it survives past that for long enough, it will move on to the tenured space, where, I believe, not much garbage collection happens. But all this depends on how long an object lives, who refers to the object and where it is allocated.
how expensive is 'new'?
It definitely incurs some overhead, but it depends on how complex the object is. If you are creating an object with just a few primitives, it is not that expensive. But if you are creating objects inside objects, maybe collections of objects, or your constructor is reading some properties file to initialize the object's member variables: EXPENSIVE!
But to be frank, if we need to create a new object, we have to create it; there is no alternative. And if we don't need to but are still creating one, that is kind of bad programming.
at the end of the method the list is no longer in use - does it mean that there's no memory allocated to it anymore or does it mean that there's a null pointer to it (since it was 'created').
Once the object does not have any reference to it, it goes out of scope and becomes eligible for garbage collection. Hence, even if it still has memory allocated, that memory will be reclaimed by the GC at some later point, whenever it runs; we need not worry about it. (And we cannot guarantee when the GC will run.)
Emptying the collection at the end will not, I think, make things any better, because the same thing happens to the individual objects in the collection as happens to the collection itself: they become eligible for GC.
For small lists, it is probably a bit cheaper to clear() the list.
For the asymptotic case of really large lists in a really large heap, it boils down to whether the GC can zero a large chunk of memory faster than the for loop in clear() can. And I think it probably can.
However, my advice would be to ignore this unless you have convincing evidence (from profiling) that you have a high turn-over of ArrayList objects. (It is a bad idea to optimize based solely on your intuition.)
It depends on how costly the object is, both in terms of the initialization required and how large its memory footprint is. It also depends heavily on the kind of application (what else does the application spend time on).
For your example with the ArrayList, it's already very hard to give a definite answer - depending on how many entries there are in the list, clear() can be very expensive or very cheap, while a new ArrayList has almost constant cost.
The general rule of thumb is: don't bother with reusing objects until you have measured that you have a performance problem, and then be very sure that creating the objects is the cause of that problem. Most likely there are more rewarding optimization opportunities in your application. A profiler will help identify the places where you spend the most time. Focus on those and on better algorithms.

Java "dead" objects not being garbage collected

I know that during garbage collection in Java, objects that don't have any more references to them are marked as "dead" so that they can be deleted from memory by the garbage collector.
My question is if, during a garbage collection phase, all of the "dead" objects get deleted from memory or some of them survive? Why would a "dead" object survive a garbage collection phase?
LATER EDIT
Thank you for all of your answers. I can deduce that the main reason why "dead" objects would not be deleted is the timing or space constraints of the way the garbage collector operates.
However, supposing that the Garbage Collector can reach all of the "dead" objects, I was wondering if there is a way to declare, reference, use, dereference, etc. an object such that it would somehow skip the deletion phase even though it is "dead". I was thinking maybe objects belonging to classes which have static methods or inner classes or something like that may be kept in memory for some reason, even though they have no references to them.
Is such a scenario possible?
Thank you
My question is if, during a garbage collection phase, all of the "dead" objects get deleted from memory or some of them survive? Why would a "dead" object survive a garbage collection phase?
All current HotSpot GCs are generational collectors. Quoting from Wikipedia:
"It has been empirically observed that in many programs, the most recently created objects are also those most likely to become unreachable quickly (known as infant mortality or the generational hypothesis). A generational GC (also known as ephemeral GC) divides objects into generations and, on most cycles, will place only the objects of a subset of generations into the initial white (condemned) set. Furthermore, the runtime system maintains knowledge of when references cross generations by observing the creation and overwriting of references. When the garbage collector runs, it may be able to use this knowledge to prove that some objects in the initial white set are unreachable without having to traverse the entire reference tree. If the generational hypothesis holds, this results in much faster collection cycles while still reclaiming most unreachable objects."
What this means for your question is that most GC cycles collect only garbage objects in young generations. A garbage object in the oldest generation can survive multiple GC cycles ... until the old generation is finally collected. (And in the new G1 GC, apparently the old generation is collected a bit at a time ... which can delay reclamation even further.)
Other causes for (notionally) unreachable objects to survive include:
Unreachable objects with (unexecuted) finalizers are attached to a finalization queue by the garbage collector for processing after the GC has finished.
Objects that are softly, weakly or phantom referenced are actually still reachable, and are handled by their respective reference queue managers after the GC has finished.
Objects that are reachable by virtue of JNI global references, etcetera. (thanks #bestss)
Various hidden references exist that relate instances, their classes and their classloaders.
There is a hidden reference from an inner instance to its outer instance.
There is a hidden reference from a class to the intern'd String objects that represent its string literals.
However, these are all consequences of the definition of reachability:
"A reachable object is any object that can be accessed in any potential continuing computation from any live thread." - JLS 12.6.1
It is also worth noting that the rules for the GC have an element of conservativeness about them. They say that a reachable object won't be deleted, but they don't say that an object that is (strictly) unreachable will be deleted. This allows for cases where an object cannot be accessed but the runtime system is unable to figure that out.
Your followup question:
However, supposing that the Garbage Collector can reach all of the "dead" objects, I was wondering if there is a way to declare, reference, use, dereference, etc.. an object such that somehow it would skip the deletion phase even though it is "dead".
"Dead" is not a well-defined term. If the garbage collector can reach the objects, they are by definition reachable. They will not be deleted while they are still reachable.
If they are both dead AND reachable (whatever "dead" means!) then the fact that they are reachable means they won't be deleted.
What you are proposing doesn't make sense.
I was thinking maybe objects belonging to classes which have static methods or inner classes or something like that may be kept in memory for some reason, even though they have no references to them. Is such a scenario possible?
Static methods don't have references ... unless they happen to be on the call stack. Then the local variables may contain references just like any other method call. Normal reachability rules apply.
Static fields are GC roots, for as long as the class itself exists. Normal reachability rules apply.
Instances of inner classes are no different from instances of other classes from a GC perspective. There can be a reference to an outer class instance in an inner class instance, but that leads to normal reachability.
In summary, there are some unexpected "causes" for reachability, but they are all a logical consequence of the definition of reachability.
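A small sketch of the inner-instance case mentioned above (the class names are made up for illustration):

class Outer {
    private final byte[] bigBuffer = new byte[8 * 1024 * 1024];   // 8 MB

    class Inner { }                   // non-static: every Inner holds a hidden Outer.this

    Inner makeInner() {
        return new Inner();
    }
}

As long as some code keeps a reference to an Inner instance, the Outer instance (and its 8 MB buffer) stays reachable through the hidden Outer.this field, even if the program never touches Outer again. Declaring the nested class static removes that hidden reference.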
As the System.gc() javadoc says
When control returns from the method call, the Java Virtual Machine has made a best effort to reclaim space from all discarded objects.
From this you can infer that a call to the garbage collector does not ensure that all unused objects will be reclaimed. As garbage collection can differ completely between implementations, no definitive answer can be given. There are even Java implementations without any garbage collection.
One potential explanation for an unreachable object not being collected is time. As of Java 1.5, the amount of time the JVM spends garbage collecting can be limited using one of the following options:
-XX:MaxGCPauseMillis
-XX:GCTimeRatio=<nnn>
Both options are explained in detail here
There are dead objects in the "young" generation and dead objects in the "old" generation. If the GC being performed is a "minor GC", only dead objects from the young generation will be collected.
Additionally, a finalize() method can delay collection: an unreachable object with a finalizer has to wait on the finalization queue and survives at least one extra GC cycle, and a finalizer can even "resurrect" the object by storing this into a reachable field. (Throwing an exception from finalize() merely halts the finalization of that object; per the Object.finalize() javadoc it is otherwise ignored, and it does not keep the object alive.)
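A sketch of that resurrection trick (finalize() is deprecated in recent Java versions; this is shown only to illustrate how a notionally dead object can come back):

class Phoenix {
    static Phoenix lastInstance;      // resurrection target

    @Override
    protected void finalize() {
        // Runs when the object is first found unreachable; storing "this"
        // into a reachable field makes the object reachable again.
        lastInstance = this;
    }
}

A finalizer runs at most once per object, so the resurrection works only a single time; once lastInstance is cleared and the object becomes unreachable again, it is reclaimed without further ceremony.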
The behaviour of the garbage collector is not fully specified. If a particular implementation chooses not to collect certain objects, it is allowed to do so. This could be done to avoid spending long periods of time in the garbage collector, which could have detrimental effects on the operation of the application.
Imagine you had a collection which contained millions of small objects, most of which were not referenced anywhere else. If the only reference to that collection was cleared, would you want the GC to spend a long time cleaning out those millions of small objects, or would you want it to do so over the course of several collections? In most cases, the latter is better for the application.

When does Java's garbage collection free a memory allocation?

I have created an object in Java, named FOO. FOO contains a large amount of data - say, a ten-megabyte text file that I have pulled into RAM for manipulation. (This is just an example.)
This is clearly a huge amount of space and I want to deallocate it from memory, so I set FOO to null.
Will this free up that space in memory automatically?
or
Will the memory taken by the loaded text file be around until automatic garbage collection?
When you set the reference of any object to null, it becomes available for garbage collection. It still occupies the memory until the garbage collector actually runs. There are no guarantees regarding when the GC will run, except that it will definitely run and reclaim memory from unreachable objects before an OutOfMemoryError is thrown.
You can call System.gc() to request garbage collection; however, that's what it is - a request. It is up to the GC's discretion whether to run.
Using a WeakReference can help in some cases. See this article by Brian Goetz.
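A minimal sketch of that idea (the loadTenMegabyteFile() method is invented for the example):

import java.lang.ref.WeakReference;

class FileHolder {
    private WeakReference<byte[]> dataRef = new WeakReference<>(loadTenMegabyteFile());

    byte[] data() {
        byte[] data = dataRef.get();
        if (data == null) {                       // already collected: reload on demand
            data = loadTenMegabyteFile();
            dataRef = new WeakReference<>(data);
        }
        return data;
    }

    private static byte[] loadTenMegabyteFile() {
        return new byte[10 * 1024 * 1024];        // stand-in for the real I/O
    }
}

For data you would rather keep until memory is actually tight, a SoftReference works the same way but is cleared less eagerly.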
Actually the object is not named FOO. FOO is the name of a variable which is not the object; the variable contains a reference to the object. There could be several distinct variables containing references to the same object.
The garbage collector works by automatically detecting unreachable objects: these are objects which the application cannot use anymore because it has irretrievably forgotten where they are (the application may possibly access any object to which it has a reference, including the references stored in fields of objects it can access, and so on).
When you set FOO = null, assuming that FOO contained at that point the last reachable reference to the object, then the memory is released immediately, in the following sense: at the very clock cycle at which null is set in FOO, the object becomes unreachable. Therefore, the garbage collector will notice that unreachable object and reclaim the corresponding memory block; that is, the GC will do that the next time it can be bothered to run. Of course, the actual bits which constitute the object may linger a bit in memory; but that block is nonetheless "free" since the memory allocator will automatically run the GC when free memory is tight. From the application point of view, the object is as good as dead and the corresponding memory is free since that memory will be reused the next time the application needs it. The whole thing is automatic.
Things are a bit more complex with regards to the operating system. If an unreachable object is free memory from the application point of view, it is still, as far as the OS is concerned, a block of RAM dedicated to the running process. That block of RAM may be given back to the OS only when the GC (which is, at the OS level, a part of the process) actually runs, notices that the object is unreachable, and condescends to give the block back to the OS. When the GC runs depends heavily on the GC technology and how the application allocates objects; also, some GCs will never give the block back to the OS at all (the GC knows that the block is free and the memory allocator will reuse it at will, but other processes cannot).
System.gc() is a hint to the VM, so that it runs the GC now. Formally, it is only a hint, and the VM is free to ignore it. In practice, it runs the GC, unless the VM was instructed not to obey such commands (with Sun's JVM, this is a matter of a specific command-line flag). Even if the GC runs, it does not necessarily give back the memory to the operating system. System.gc() is not terribly useful.
Setting foo = null; does not mean that foo will be garbage collected immediately. Instead, it will be collected when the GC next runs, if it can be. When foo is collected, any objects for which it holds the sole reference will also be eligible for collection and therefore collected.
Note that even calling System.gc() does not guarantee that that JVM will do it right away.
System.gc() is just a request and there is no guarantee that it takes effect immediately.
There's no guarantee that the JVM will do it right away; you can try to encourage it by calling System.gc().
The garbage collector will free the memory after you "destroy" the reference, i.e. set the object reference to null. You can use the forced garbage collection option, but you should use it with care. The garbage collector is designed to run on an optimized schedule, so calling System.gc() may ruin that rhythm and possibly reduce performance due to unnecessary task switching.
Alternatively, you can think about a way that allows you not to load such large amounts of data into memory in the first place. If you can achieve that by improving your code, that would be much better.

How to cause soft references to be cleared in Java?

I have a cache which has soft references to the cached objects. I am trying to write a functional test for behavior of classes which use the cache specifically for what happens when the cached objects are cleared.
The problem is: I can't seem to reliably get the soft references to be cleared. Simply using up a bunch of memory doesn't do the trick: I get an OutOfMemory before any soft references are cleared.
Is there any way to get Java to more eagerly clear up the soft references?
Found here:
"It is guaranteed though that all
SoftReferences will get cleared before
OutOfMemoryError is thrown, so they
theoretically can't cause an OOME."
So does this mean that the above scenario MUST mean I have a memory leak somewhere with some class holding a hard reference on my cached object?
The problem is: I can't seem to reliably get the soft references to be cleared.
This is not unique to SoftReferences. Due to the nature of garbage collection in Java, there is no guarantee that anything that is garbage-collectable will actually be collected at any point in time. Even with a simple bit of code:
Object temp = new Object();
temp = null;
System.gc();
there is no guarantee that the Object instantiated in the first line is garbage collected at this point, or in fact at any point. It's simply one of the things you have to live with in a memory-managed language: you're giving up deterministic control over these things. And yes, that can make it hard to definitively test for memory leaks at times.
That said, as per the Javadocs you quoted, SoftReferences should definitely be cleared before an OutOfMemoryError is thrown (in fact, that's the entire point of them and the only way they differ from default object references). It would thus sound like there is some sort of memory leak, in that you're holding onto stronger references to the objects in question.
If you use the -XX:+HeapDumpOnOutOfMemoryError option to the JVM, and then load the heap dump into something like jhat, you should be able to see all the references to your objects and thus see if there are any references beside your soft ones. Alternatively you can achieve the same thing with a profiler while the test is running.
There is also the following JVM parameter for tuning how soft references are handled:
-XX:SoftRefLRUPolicyMSPerMB=<value>
Where 'value' is the number of milliseconds a soft reference will remain alive for every free MB of memory. The default is 1s/MB, so if an object is only softly reachable it will last 1s when only 1MB of heap space is free.
You can force all SoftReferences to be cleared in your tests with this piece of code.
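One common way to do that (not necessarily the code the answer above links to, but a sketch of the usual trick) is to allocate until an OutOfMemoryError is imminent; the JVM must clear all softly reachable objects before it throws the error:

import java.util.ArrayList;
import java.util.List;

class SoftReferenceFlusher {
    static void flushSoftReferences() {
        try {
            List<byte[]> hog = new ArrayList<>();
            while (true) {
                hog.add(new byte[1024 * 1024]);   // 1 MB at a time until the heap is exhausted
            }
        } catch (OutOfMemoryError expected) {
            // By the SoftReference guarantee, every softly reachable object
            // has been cleared before this error was thrown.
        }
    }
}

In a test you would call flushSoftReferences() and then assert that your cache's SoftReference.get() returns null; catching OutOfMemoryError is tolerable here because the hoarded list goes out of scope as soon as the try block is abandoned.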
If you really wanted to, you can call clear() on your SoftReference to clear it.
That said, if the JVM is throwing an OutOfMemoryError and your SoftReference has not been cleared yet, then this means that you must have a hard reference to the object somewhere else. To do otherwise would invalidate the contract of SoftReference. Otherwise, you are never guaranteed that the SoftReference is cleared: as long as there is still memory available, the JVM does not need to clear any SoftReferences. On the other hand, it is allowed to clear them next time it does a GC cycle, even if it doesn't need to.
Also, you can consider looking into WeakReferences, since the VM tends to be more aggressive in clearing them. Technically, the VM isn't ever required to clear a WeakReference, but it is supposed to clean them up the next time it does a GC cycle if the object would otherwise be considered dead. If you are trying to test what happens when your cache is cleared, using WeakReferences should help your entries go away faster.
Also, remember that both of these are dependent on the JVM doing a GC cycle. Unfortunately, there is no way to guarantee that one of those ever happens. Even if you call System.gc(), the garbage collector may decide that it is doing just peachy and choose to do nothing.
In a typical JVM implementation (Sun), you need to trigger a full GC more than once to get the SoftReferences cleaned up. The reason is that SoftReferences require the GC to do more work, for example because of the mechanism that allows you to be notified when the objects are reclaimed.
IMHO, using a lot of soft references in an application server is evil, because the developer does not have much control over when they are released.
Garbage collection and reference types like soft references are non-deterministic, so it's not really possible to reliably arrange for soft references to be cleared at a definite point so that your test can judge how your cache reacts. I would suggest you simulate the reference clearing in a more deterministic way, by mocking etc. - your tests will be reproducible and more valuable, rather than just hoping for the GC to clean up the references. Relying on the latter approach is a really bad idea and will just introduce additional problems rather than help you improve the quality of your cache and its collaborating components.
From the documentation and my experience I'd say yes: you must have a reference somewhere else.
I'd suggest using a debugger that can show you all references to an object (such as Eclipse 3.4 when debugging Java 6) and just check when the OOM is thrown.
If you use Eclipse, there is a tool named Memory Analyzer that makes heap dump debugging easier.
Does the cached object have a finalizer? The finalizer creates a new strong reference to the object while it runs, so even if the SoftReference is cleared, the memory will not be reclaimed until a later GC cycle.
If you have a cache which is a Map of SoftReferences and you want them cleared, you can just clear() the map and they will all be cleaned up (including their referents).
