Consider the following two examples of cyclic references:
Straightforward cyclic reference
class A {
    B b;
}
class B {
    A a;
}
Using a WeakReference
class A {
    B b;
}
class B {
    WeakReference<A> aRef;
}
The following SO question, answered by Jon Skeet, makes clear that the straightforward example will also be garbage collected, as long as no "GC walk" from a known root reaches the cycle.
My question is as follows:
Is there any reason, performance or otherwise, to use or not to use the idiom represented in example 2 (the one employing a WeakReference)?
Is there any reason, performance or otherwise, to use or not to use the idiom represented in example 2
The Java Reference types have a couple of performance implications:
They use more space than regular references.
They are significantly more work for the garbage collector than ordinary references.
I also believe that they can cause the collection of objects to be delayed by one or more GC cycles ... depending on the GC implementation.
In addition, the application has to deal with the possibility that a WeakReference has been cleared, in which case get() returns null.
By contrast, there are no performance or space overheads for normal cyclic references as you use them in your first example.
In summary, your weak reference idiom reduces performance and increases program complexity ... with no tangible benefits that I can see.
My guess is that this Question derives from the mistaken notion that cyclic references are more expensive than non-cyclic references in Java ... or that they are somehow problematic. (What other logical reason would cause one to propose an "idiom" like this?) In fact, this is not the case. Java garbage collectors don't suffer from the problems of reference counting; e.g. C++ "smart pointers". Cyclic references are handled correctly (i.e. without leaking memory) and efficiently in Java.
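To make the added complexity concrete, here is a minimal sketch of what the weak back-reference idiom forces on every caller. The class names (Parent, Child) and the parentName() helper are hypothetical stand-ins for the question's A and B, and clear() is called by hand only to simulate, deterministically, what the collector may do at some unpredictable time.

```java
import java.lang.ref.WeakReference;

// Hypothetical classes mirroring example 2 from the question.
class Parent {
    Child child;
    final String name;

    Parent(String name) {
        this.name = name;
    }
}

class Child {
    // Weak back-reference: on its own it does not keep the Parent alive.
    final WeakReference<Parent> parentRef;

    Child(Parent parent) {
        this.parentRef = new WeakReference<>(parent);
    }

    // Every access has to handle the case where the referent is gone.
    String parentName() {
        Parent parent = parentRef.get(); // take a strong reference first
        return (parent != null) ? parent.name : "<parent collected>";
    }
}

class WeakBackReferenceDemo {
    public static void main(String[] args) {
        Parent parent = new Parent("root");
        Child child = new Child(parent);
        parent.child = child;

        System.out.println(child.parentName()); // prints "root"

        // Simulate the collector clearing the reference.
        child.parentRef.clear();
        System.out.println(child.parentName()); // prints "<parent collected>"
    }
}
```

With a plain field (example 1) none of the null handling is needed, which is the point made above.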
The problem is that you do not know when the GC will clear a WeakReference.
It may be cleared at any time after the referent stops being strongly reachable, possibly right after you create it. The GC is eager to collect weakly reachable objects.
You can keep a strong reference to the referent from a GC root to prevent it from being collected.
Or you can check its status through a ReferenceQueue.
It is like the finalize method: you do not know when the GC will run it.
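As a rough illustration of the queue-based approach, the sketch below registers a WeakReference with a java.lang.ref.ReferenceQueue and polls the queue. The class name, the PAYLOAD field and the drainContains() helper are made up for this example. Because the collector's timing cannot be predicted, the demo calls enqueue() itself to stand in for the GC; in real code the JVM does that step.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

class ReferenceQueueDemo {
    // A static field is a GC root, so the referent stays strongly reachable.
    static final Object PAYLOAD = new Object();

    // Drains the queue and reports whether the wanted reference was in it.
    static boolean drainContains(ReferenceQueue<Object> queue, Reference<?> wanted) {
        boolean found = false;
        Reference<?> polled;
        while ((polled = queue.poll()) != null) { // poll() never blocks
            if (polled == wanted) {
                found = true;
            }
        }
        return found;
    }

    public static void main(String[] args) {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        WeakReference<Object> ref = new WeakReference<>(PAYLOAD, queue);

        // The referent is still strongly reachable, so nothing is queued.
        System.out.println(drainContains(queue, ref)); // prints "false"

        // Stand-in for the collector: normally the GC enqueues the reference
        // some time after the referent becomes weakly reachable.
        ref.enqueue();
        System.out.println(drainContains(queue, ref)); // prints "true"
    }
}
```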
Sources:
http://pawlan.com/monica/articles/refobjs/
http://docs.oracle.com/javase/7/docs/api/java/lang/ref/WeakReference.html
Related
I'm looking for a way to delete an object in Java, make it eligible for GC. I have a Java class that needs to have its delete() method called.
public class SomeObj {
    // some implementation stuff
    //...
    void delete() {
        //remove yourself from some lists in the program
        //...
        this = null; // <- this line is illegal
        //delete this; <- if I was in C++, I could do this
    }
}
How should I do this? Apparently, I'm going to have to refactor my code because this smells like bad design.
For better or worse, Java is a language that runs in a garbage-collecting environment. An object has some kind of existence in an application so long as it is reachable via references. Once it is no longer reachable -- when no other object holds a reference to it -- it is "deleted" so far as the application is concerned.
That the object still has some after-life in the heap is a matter for the garbage collector, not the application. An application that depends on being able to control the existence of objects to which there are no references is broken in some logical sense.
The usual, semi-legitimate reason for wanting to nudge an unreferenced object out of the heap for good is to conserve heap space. There have been many, many occasions when I've known when an object is really finished with better than the garbage collector ever could. Objects that store temporary results with method scope are a good example. I'm primarily a C and C++ developer, and I really want a method on java.lang.Object called ImDoneWithYouNow(). Sadly, it doesn't exist, and we have to rely on the GC implementation to take care of memory management.
You don't need (and really shouldn't have) a "destructor". Once no other object references the object in question, it becomes eligible for garbage collection, and will be removed by the garbage collector when it sees fit.
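One possible shape for the question's delete(), sketched under the assumption that "some lists in the program" can be modelled by a single registry: the object removes itself from whatever holds it, and the caller drops its own reference. Registry, Widget, create() and dispose() are hypothetical names, not anything from the original code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical registry standing in for "some lists in the program".
class Registry {
    private final List<Widget> widgets = new ArrayList<>();

    void register(Widget w) { widgets.add(w); }
    void unregister(Widget w) { widgets.remove(w); }
    int size() { return widgets.size(); }
}

class Widget {
    private final Registry registry;

    private Widget(Registry registry) {
        this.registry = registry;
    }

    // Factory method so that "this" is not published from the constructor.
    static Widget create(Registry registry) {
        Widget w = new Widget(registry);
        registry.register(w);
        return w;
    }

    // Plays the role of delete(): remove the references the program holds
    // to this object. Nothing can or needs to be done about "this" itself.
    void dispose() {
        registry.unregister(this);
    }
}

class DisposeDemo {
    public static void main(String[] args) {
        Registry registry = new Registry();
        Widget widget = Widget.create(registry);
        System.out.println(registry.size()); // prints "1"

        widget.dispose();
        widget = null; // the caller lets go of its own reference as well
        System.out.println(registry.size()); // prints "0"
        // The Widget is now unreachable and eligible for collection
        // whenever the collector next runs.
    }
}
```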
In the following example there are two functionally equivalent methods:
public class Question {
    public static String method1() {
        String s = new String("s1");
        // some operations on s1
        s = new String("s2");
        return s;
    }
    public static String method2() {
        final String s1 = new String("s1");
        // some operations on s1
        final String s2 = new String("s2");
        return s2;
    }
}
However, in the first of them (method1), the string "s1" is clearly available for garbage collection before the return statement. In the second (method2), the string "s1" is still reachable (though from a code review perspective it is not used anymore).
My question is: is there anything in the JVM spec which says that once a variable is unused further down the method, the object it refers to may become available for garbage collection?
EDIT:
Sometimes variables refer to objects, such as a fully rendered image, that have a real impact on memory.
I'm asking because of practical considerations. I have a large chunk of memory-greedy code in one method, and I am wondering whether I could help the JVM (a bit) just by splitting this method into a few smaller ones.
I really prefer code where no reassignment is done, since it's easier to read and reason about.
UPDATE: per jls-12.6.1:
Java compiler or code generator may choose to set a variable or parameter that will no longer be used to null to cause the storage for such an object to be potentially reclaimable sooner
So it looks like it's possible for the GC to reclaim an object which is still visible. I doubt, however, that this optimisation is done during offline compilation (it would screw up debugging); most likely it is done by the JIT.
No, because your code could conceivably retrieve it and do something with it, and the abstract JVM does not consider what code is coming ahead. However, a very, very, very clever optimizing JVM might analyze the code ahead and find that there is no way s1 could ever be referenced, and garbage collect it. You definitely can't count on this, though.
If you're talking about the interpreter, then in the second case S1 remains "referenced" until the method exits and the stack frame is rolled up. (That is, in the standard interpreter -- it's entirely possible for GC to use liveness info from method verification. And, in addition (and more likely), javac may do its own liveness analysis and "share" interpreter slots based on that.)
In the case of the JITC, however, an even mildly optimizing one might recognize that S1 is unused and recycle that register for S2. Or it might not. The GC will examine register contents, and if S1 has been reused for something else then the old S1 object will be reclaimed (if not otherwise referenced). If the S1 location has not been reused then the S1 object might not be reclaimed.
"Might not" because, depending on the JVM, the JITC may or may not provide the GC with a map of where object references are "live" in the program flow. And this map, if provided, may or may not precisely identify the end of the "live range" (the last point of reference) of S1. Many different possibilities.
Note that this potential variability does not violate any Java principles -- GC is not required to reclaim an object at the earliest possible opportunity, and there's no practical way for a program to be sensitive to precisely when an object is reclaimed.
The VM is free to optimize the code to nullify s1 before method exit (as long as that is correct), so the object s1 refers to might become eligible for collection earlier.
However, that is hardly necessary. Many method invocations will usually have completed before the next GC; their stack frames have been cleared anyway, so there is no need to worry about a specific local variable in a specific method invocation.
As far as Java the language is concerned, garbage can live forever without affecting program semantics. That's why the JLS hardly talks about garbage at all.
in first of them string "s1" is clearly available for garbage collection before return statement
It isn't clear at all. I think you are confusing 'unused' with 'unreachable'. They aren't necessarily the same thing.
Formally speaking the variable is live until its enclosing scope terminates, so it isn't available for garbage collection until then.
However "a Java compiler or code generator may choose to set a variable or parameter that will no longer be used to null to cause the storage for such an object to be potentially reclaimable sooner" JLS #12.6.1.
Basically, stack frames and static areas are considered roots by the GC. So if an object is referenced from any stack frame, it is considered alive. The problem with reclaiming objects from an active stack frame is that the GC works in parallel with the application (the mutator). How should the GC find out that an object is unused while the method is still in progress? That would require synchronization, which would be VERY heavy and complex; in fact, it would break the idea of the GC working in parallel with the mutator. Every thread might keep variables in processor registers. To implement your logic, those would also have to be added to the GC roots. I can't even imagine how to implement it.
To answer your question: if you have any logic which produces a lot of objects that are not used later, separate it into a distinct method. This is actually good practice.
You should also take into account optimizations by the JVM (as EJP pointed out). There is also escape analysis, which might prevent an object from being allocated on the heap at all. But relying on these for your code's performance is bad practice.
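A small sketch of the "separate it into a distinct method" advice, with made-up names (ScopedWork, checksum, report): the large temporary array is referenced only from the helper's stack frame, so once the helper returns there is no local variable left that could keep it reachable, whatever the JIT does or does not optimize.

```java
class ScopedWork {
    // The large temporary array is referenced only from this method's frame.
    static long checksum(int size) {
        int[] scratch = new int[size];
        for (int i = 0; i < scratch.length; i++) {
            scratch[i] = i;
        }
        long sum = 0;
        for (int value : scratch) {
            sum += value;
        }
        return sum;
    } // the frame is popped here, so no local still refers to scratch

    static String report(int size) {
        long sum = checksum(size);
        // Any long-running work placed here has no reference to the
        // scratch array in scope.
        return "checksum=" + sum;
    }

    public static void main(String[] args) {
        System.out.println(report(1000)); // prints "checksum=499500"
    }
}
```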
In Objective-C, it is possible for two different references to point to each other.
But is this possible in Java? I mean, can two object references point to each other? If it's possible, when are they going to be garbage collected?
And, in the case of nested classes, two objects (the inner class's and the outer class's) are linked to each other. How are these objects garbage collected?
I assume you are talking about circular references. Java's GC considers objects "garbage" if they aren't reachable through a chain starting at a GC root. Even though objects may point to each other to form a cycle, they're still eligible for GC if cut off from the root.
There are four kinds of GC roots in Java:
Local variables are kept alive by the stack of a thread. This is not a real object virtual reference and thus is not visible. For all intents and purposes, local variables are GC roots.
Active Java threads are always considered live objects and are therefore GC roots. This is especially important for thread local variables.
Static variables are referenced by their classes. This fact makes them de facto GC roots. Classes themselves can be garbage-collected, which would remove all referenced static variables. This is of special importance when we use application servers, OSGi containers or class loaders in general.
JNI References are Java objects that the native code has created as part of a JNI call. Objects thus created are treated specially because the JVM does not know if it is being referenced by the native code or not. Such objects represent a very special form of GC root.
You can also read here for more information.
Yes, you can do this. Like this:
class Pointy {
    public Pointy other;
}
Pointy one = new Pointy();
Pointy two = new Pointy();
one.other = two;
two.other = one;
They're garbage collected when both objects are not pointed at by anything other than one another, or other objects which are "unreachable" from current running code. The Java garbage collectors are "tracing" garbage collectors, which means they can discover this sort of issue.
Conversely, reference-counted systems (like Objective C without its "modern" garbage collection -- I don't know what the default is) cannot normally detect this sort of issue, so the objects can be leaked.
Of course you can have objects reference each other. You could simply pass the this pointer in both objects to each other, which is perfectly valid.
However, that doesn't mean that the objects are still accessible from a GC root. Think of the object graph as a tree. If you cut off a complete branch from the trunk, the whole branch is lost, no matter how many objects are involved or are maintaining references to each other.
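The branch-and-trunk picture can be sketched as a toy "mark" pass over an object graph. This is only an illustration of reachability from roots, not how any real JVM collector is implemented; Node, MarkSketch and reachableFrom() are invented for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Toy object: a name plus outgoing references.
class Node {
    final String name;
    final List<Node> refs = new ArrayList<>();

    Node(String name) {
        this.name = name;
    }
}

class MarkSketch {
    // Follows references outward from the roots and returns every node
    // visited. Anything not in the result would count as garbage.
    static Set<Node> reachableFrom(List<Node> roots) {
        Set<Node> marked = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node node = pending.pop();
            if (marked.add(node)) { // false if already visited, so cycles end
                pending.addAll(node.refs);
            }
        }
        return marked;
    }

    public static void main(String[] args) {
        Node root = new Node("root");
        Node a = new Node("a");
        Node b = new Node("b");
        Node c = new Node("c");

        root.refs.add(a); // root -> a
        b.refs.add(c);    // b -> c
        c.refs.add(b);    // c -> b: a cycle that no root points into

        Set<Node> live = reachableFrom(Arrays.asList(root));
        System.out.println(live.contains(a)); // prints "true"
        System.out.println(live.contains(b)); // prints "false"
        System.out.println(live.contains(c)); // prints "false"
    }
}
```

The b/c cycle is never visited because the walk starts at the roots, which is why a tracing collector does not need any special handling for cycles.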
Consider the following scenario.
You are building a class, in Java, where the fundamental semantics of the class demand that no two instances of the class be equal in value unless they are in fact the same object (see instance-controlled classes in Effective Java by Joshua Bloch). In a sense this is like a very large enum (possibly hundreds of millions of "constants") whose values are not known until runtime. So, to recap, you want the class to ensure that there are no "equal" instances on the heap. There may be lots of references to a particular object on the heap, but no extraneous equal objects.
This can obviously be done in code, but it seems to me that there is a major flaw that I have not seen addressed anywhere, including in Effective Java. It seems to me that in order to make this guarantee, the instance-controlled class must keep a reference to every instance of itself that has EVER been created at any point during program execution, and can NEVER "delete" one of those objects, because it can never know that there are no longer any "pointers" to that object (besides the one that it itself keeps). In other words, if you think about this in the context of reference counting, there will come some point in the program where the only reference to the object is the one held by the class itself (the one that says, "this was created at some point"). At that point you would like to release the memory associated with the object, but you can't, because that one remaining pointer has no way of knowing that it is the last one.
Is there a good approach to providing instance-controlled classes which can also free no-longer-needed memory?
Update: So, I think I've found something that might help. It turns out Java has a java.lang.ref package that provides weak references. From Wikipedia: "A WeakReference is used to implement weak maps. An object that is not strongly or softly reachable, but is referenced by a weak reference is called "weakly reachable". A weakly reachable object is garbage collected in the next collection cycle. This behavior is used in the class java.util.WeakHashMap. A weak map allows the programmer to put key/value pairs in the map and not worry about the objects taking up memory when the key is no longer reachable anywhere else. Another possible application of weak references is the string intern pool. Semantically, a weak reference means "get rid of this object when nothing else references it at the next garbage collection.""
You need to use one of the special reference objects, like a weak reference. These were created just to support the use case you mention.
As you create an object, you search your collection of weak references to see if the object already exists; if it does, you return a regular reference to it. If it does not, you create it and return a regular reference, and add a weak reference to it to your collection.
Your weak reference, if registered with a ReferenceQueue, will notify you when the object is no longer used anywhere outside of your collection; you can then remove the stale entry from your collection. With no strong references anywhere, the object can be garbage collected.
The general concept is called a "canonicalizing cache."
The WeakHashMap class is a shortcut that does some of the plumbing for this for you.
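A minimal sketch of such a canonicalizing cache built on WeakHashMap, assuming the canonicalized type has value-based equals() and hashCode(). The Interner class and its intern() method are hypothetical names. The value is wrapped in a WeakReference because a WeakHashMap holds its values strongly, and a value that strongly referenced its own key would keep the entry alive forever.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical canonicalizing cache: at most one live instance per value.
final class Interner<T> {
    // Keys are held weakly by WeakHashMap; the values are weak as well so
    // that an entry does not keep its own key reachable.
    private final Map<T, WeakReference<T>> pool = new WeakHashMap<>();

    synchronized T intern(T candidate) {
        WeakReference<T> ref = pool.get(candidate);
        T existing = (ref != null) ? ref.get() : null;
        if (existing != null) {
            return existing; // an equal instance is still alive: reuse it
        }
        pool.put(candidate, new WeakReference<>(candidate));
        return candidate;
    }
}

class InternerDemo {
    public static void main(String[] args) {
        Interner<String> interner = new Interner<>();
        String first = new String("value");
        String second = new String("value"); // equal, but a distinct object

        String canonical = interner.intern(first);
        System.out.println(canonical == first);                  // prints "true"
        System.out.println(interner.intern(second) == canonical); // prints "true"
    }
}
```

Once the program drops every strong reference to a canonical instance, the map entry becomes eligible for removal, and a later intern() of an equal value simply creates a new canonical instance.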
It is not clear what your requirements are. You say you want hundreds of millions of entries. This suggests that a database or NoSQL store is the best way to hold them.
To ensure you have no duplicates, you can keep track of referenced objects which have been retained with a WeakHashMap.
Given an aggregation of class instances which refer to each other in a complex, circular, fashion: is it possible that the garbage collector may not be able to free these objects?
I vaguely recall this being an issue in the JVM in the past, but I thought it was resolved years ago. Yet some investigation in jhat has revealed a circular reference to be the reason for a memory leak that I am now faced with.
Note: I have always been under the impression that the JVM was capable of resolving circular references and freeing such "islands of garbage" from memory. However, I am posing this question just to see if anyone has found any exceptions.
Only a very naive implementation would have a problem with circular references. Wikipedia has a good article on the different GC algorithms. If you really want to learn more, try the book Garbage Collection: Algorithms for Automatic Dynamic Memory Management. Java has had a good garbage collector since 1.2 and an exceptionally good one in 1.5 and Java 6.
The hard part for improving GC is reducing pauses and overhead, not basic things like circular reference.
The garbage collector knows where the root objects are: statics, locals on the stack, etc and if the objects aren't reachable from a root then they will be reclaimed. If they are reachable, then they need to stick around.
Ryan, judging by your comment to Circular References in Java, you fell into the trap of referencing objects from a class, which was probably loaded by the bootstrap/system classloader. Every class is referenced by the classloader that loaded the class, and can thus be garbage-collected only if the classloader is no longer reachable. The catch is that the bootstrap/system classloader is never garbage collected, therefore, objects reachable from classes loaded by the system classloader cannot be garbage-collected either.
The reasoning for this behavior is explained in JLS. For example, Third Edition 12.7 http://java.sun.com/docs/books/jls/third_edition/html/execution.html#12.7.
If I remember correctly, then according to the specifications, there are only guarantees about what the JVM can't collect (anything reachable), not what it will collect.
Unless you are working with real-time JVMs, most modern garbage collectors should be able to handle complex reference structures and identify "subgraphs" that can be eliminated safely. The efficiency, latency, and likelihood of doing this improve over time as more research ideas make their way into standard (rather than research) VMs.
No, at least using Sun's official JVM, the garbage collector will be able to detect these cycles and free the memory as soon as there are no longer any references from the outside.
The Java specification says that the garbage collector can garbage collect your object ONLY if it is not reachable from any thread.
Reachable means there is a reference, or chain of references, that leads from A to B, and can go via C, D, ... Z for all it cares.
The JVM not collecting things has not been a problem for me since 2000, but your mileage may vary.
Tip: Java serialization caches objects to make object mesh transfer efficient. If you have many large, transient objects, and all your memory is getting hogged, reset your serializer to clear its cache.
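For the serialization tip, the relevant call is ObjectOutputStream.reset(), which discards the stream's table of already-written objects. The sketch below is one way to apply it; the class name and the writeAll() helper are invented, and resetting after every object is only for illustration (a real program would likely reset less often, since each reset makes later writes repeat class descriptors).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

class SerializationResetDemo {
    // Writes `count` large temporary arrays and returns the number of bytes
    // produced. Without reset(), the stream remembers every object written
    // so far, which keeps all of them reachable while the stream is open.
    static int writeAll(int count, boolean resetEachTime) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            for (int i = 0; i < count; i++) {
                out.writeObject(new int[1024]); // large, short-lived object
                if (resetEachTime) {
                    out.reset(); // forget previously written objects
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    public static void main(String[] args) {
        int plain = writeAll(10, false);
        int withReset = writeAll(10, true);
        // reset() costs a few extra bytes per call in exchange for letting
        // the written objects be collected.
        System.out.println(withReset > plain); // prints "true"
    }
}
```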
A circular reference happens when one object refers to another, and that other one refers to the first object. For example:
class A {
    private B b;
    public void setB(B b) {
        this.b = b;
    }
}
class B {
    private A a;
    public void setA(A a) {
        this.a = a;
    }
}
public class Main {
    public static void main(String[] args) {
        A one = new A();
        B two = new B();
        // Make the objects refer to each other (creates a circular reference)
        one.setB(two);
        two.setA(one);
        // Throw away the references from the main method; the two objects are
        // still referring to each other
        one = null;
        two = null;
    }
}
Java's garbage collector is smart enough to clean up objects that refer to each other in a cycle, as long as no live thread has a reference to any of them anymore. So having a circular reference like this does not create a memory leak.
Just to amplify what has already been said:
The application I've been working on for six years recently changed from Java 1.4 to Java 1.6, and we've discovered that we've had to add static references to things that we didn't even realize were garbage collectable before. We didn't need the static reference before because the garbage collector used to suck, and it is just so much better now.
Reference-counting GCs are notorious for this issue. Notably, Sun's JVM doesn't use a reference-counting GC.
If an object cannot be reached from the root of the heap (typically, at a minimum, through the classloaders if nothing else), then it will be destroyed, as unreachable objects are not copied to the new heap during a typical Java GC.
The garbage collector is a very sophisticated piece of software -- it has been tested in a huge JCK test suite. It is NOT perfect, BUT there is a very good chance that as long as the Java compiler (javac) compiles all of your classes and the JVM instantiates them, you should be good.
Then again, if you are holding references to the root of this object graph, the memory will NOT be freed BUT if you know what you're doing, you should be OK.