Problem background
I run a Groovy/Grails system that dynamically compiles and loads user-defined code. This is basically done via GroovyClassLoader.
I'm seeing the dynamic classes themselves both loaded AND unloaded just fine. I have added the
-XX:+CMSClassUnloadingEnabled -XX:+CMSPermGenSweepingEnabled
flags and that works fine. The dynamic classes are being unloaded when the classloader is GC'ed.
Problem description
However, my heap memory is leaking, and the leak traces back to the JVM's application class loader (Launcher$AppClassLoader).
Now, the Launcher$AppClassLoader has a property parallelLockMap defined in java.lang.ClassLoader. Apparently this ConcurrentHashMap is used to hold parallel class loading locks in Java 7. Check out the source code.
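Paraphrased from the JDK 7 sources (not verbatim), the relevant method looks roughly like this. Note that entries are only ever added to parallelLockMap; there is no corresponding removal:

protected Object getClassLoadingLock(String className) {
    Object lock = this;
    if (parallelLockMap != null) {
        // One lock object per class name, held for the lifetime of this
        // ClassLoader instance; nothing ever removes these entries.
        Object newLock = new Object();
        lock = parallelLockMap.putIfAbsent(className, newLock);
        if (lock == null) {
            lock = newLock;
        }
    }
    return lock;
}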
The entries held in this map have String keys (the class name for which the lock applies) and Object values (the lock objects). There are two kinds of leaking keys:
"groovy.runtime.metaclass.MyDynamicallyLoadedClassNameMetaClass"
"MyDynamicallyLoadedClassNameBeanInfo"
So to me it looks like these keys are created in parallelLockMap when the *MetaClass and *BeanInfo classes are loaded, but they are not removed when the associated dynamically loaded class (and its metaclasses) are unloaded. Now this is getting quite deep into the internals of Groovy and its metaclass system, and I'm running out of expertise.
Test Case
This will eventually run you out of heap space, although it will take a while:
String newClass = "class CLASSNAME {}"
while (true) {
    GroovyClassLoader gcl = new GroovyClassLoader()
    Class clazz = gcl.parseClass(newClass.replace("CLASSNAME", "NewClass" + System.nanoTime()))
    clazz.newInstance()
}
Be sure to run this with the above JVM flags so that you run out of heap space, not PermGen space. Again, PermGen gets garbage collected nicely, no leak there.
Questions
1) Is this a bug in Groovy or Java 7? Whose responsibility would it be to clean the parallelLockMap? Should I submit an issue report?
2) Is there a workaround? I'm thinking of using a custom ClassLoader that does not first try to delegate class loading to its parent for these MetaClass and BeanInfo classes, thus preventing the call to java.lang.ClassLoader#loadClass(..). I'm not an expert in Java/Groovy classloading, though.
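For illustration, something like the following untested sketch is what I have in mind. It assumes that failing fast on the *MetaClass / *BeanInfo probes is safe for my application (a real BeanInfo class on the parent classpath would no longer be found), so treat it as an experiment rather than a proven fix:

import groovy.lang.GroovyClassLoader;

public class NonDelegatingGroovyClassLoader extends GroovyClassLoader {

    public NonDelegatingGroovyClassLoader(ClassLoader parent) {
        super(parent);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        // Groovy probes the classloader for these companion classes on every
        // dynamically loaded class. Failing fast avoids delegating to the
        // parent, so the parent's parallelLockMap never gets an entry for them.
        boolean metaClassProbe = name.startsWith("groovy.runtime.metaclass.") && name.endsWith("MetaClass");
        boolean beanInfoProbe = name.endsWith("BeanInfo");
        if (metaClassProbe || beanInfoProbe) {
            throw new ClassNotFoundException(name);
        }
        return super.loadClass(name, resolve);
    }
}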
EDIT: I have submitted a Groovy JIRA here.
EDIT: On the JDK side this issue has recently been reported here.
Related
The need for extracting class definitions from a heap dump comes from the way some classes are loaded dynamically, not from static lib jars, but compiled dynamically or loaded over the network. The heap dump has the same size as the actually heap so I assume all the classes are there, possibly in the permgen. The objective is to extract the definitions in the form of .class files for further examination.
There do not seem to be any tools that readily allow you to retrieve the class bits from a VM, let alone from a heap dump. Nor is it clear that the class definition is even available in the exact same format as the contents of the .class in the VM.
But there are multiple options for saving the class definition before it is loaded into the VM. You could have a Java agent store the class definition, either in the heap or in external storage. This should also be possible with a custom classloader, although it might be bypassed by some other custom classloader.
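As a rough sketch of the agent approach (the class name, output directory and agent jar name below are made up), a java.lang.instrument transformer can simply write out every class definition it sees before the class is loaded:

import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.ProtectionDomain;

public class ClassDumpAgent {

    // Register via a Premain-Class entry in the agent jar's manifest and start
    // the JVM with e.g. -javaagent:classdump-agent.jar=dumped-classes
    public static void premain(String agentArgs, Instrumentation inst) {
        final Path outDir = Paths.get(agentArgs != null ? agentArgs : "dumped-classes");
        inst.addTransformer(new ClassFileTransformer() {
            public byte[] transform(ClassLoader loader, String className, Class<?> classBeingRedefined,
                                    ProtectionDomain protectionDomain, byte[] classfileBuffer) {
                try {
                    if (className != null) {          // can be null for some generated classes
                        Path target = outDir.resolve(className + ".class");
                        Files.createDirectories(target.getParent());
                        Files.write(target, classfileBuffer);
                    }
                } catch (Exception e) {
                    // never let the dump break class loading
                }
                return null;                          // null = leave the bytecode unchanged
            }
        });
    }
}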
The popular AOP tool AspectJ has an option to save the definitions of instrumented classes; it can probably be used for your use case.
I have gone through the hot deployment tutorial and it works.
But I have a question about limitation (point 3), i.e.
Hot deploy has supported the code changes in the method implementation only. If you add a new class or a new method, restart is still required.
Basically, why is no server restart needed when I change an existing method, but one is required when I add a new method or class?
My understanding of how it works: when I change an existing method or introduce a new one, Eclipse places the file at the right location
under the web server. If the class has already been loaded by a classloader into PermGen space, the classloader unloads it from PermGen and loads the new one internally, without a server restart, so that the new bytecode is picked up. Is that correct?
If yes, why does hot deployment not work for new methods and new class files?
The reasoning is quite complicated and really only fully known to people with intimate knowledge of the JVM and how it manages memory. There is a decent explanation in the Java HotSwap Guide, in the section titled "Why is HotSwap limited to method bodies?" (although it is really an advertisement for the JRebel product).
The gist: there are two primary factors that prevent HotSwap from handling structural changes to classes: JIT and memory allocation.
The JIT (Just In Time) compiler in the JVM optimizes the bytecode after classes have been loaded and run a few times, basically inlining many calls for increased performance. Implementing that feature safely and effectively in an environment where class signatures and structure can change would be a significant challenge.
Other problems surround what would happen regarding memory management if class structures were allowed to change. The JVM would have to modify existing instances of classes, which would mean relocating them to other parts of the heap storage. Not to mention having to relocate the class objects themselves. The JVM's memory management is already incredibly complex and highly optimized; such changes would only increase the complexity and potentially reduce performance of the JIT compiler (and likely lead to additional bugs).
I think it's safe to assume that the JVM engineers have not been willing to take the performance and bug footprint tradeoffs that would be required to support this feature. Which is why products like JRebel and others have come to exist.
As a side note, the specification itself is not limited.
It just happens that some of the available implementations, including the ubiquitous reference implementation, are limited.
After you connect to a remote VM, you can check whether it allows adding methods or redefining classes.
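For example, with the JDI (tools.jar on older JDKs, the jdk.jdi module later) you can attach to a VM started with the usual -agentlib:jdwp options and query its capabilities. The hostname and port below are placeholders:

import com.sun.jdi.Bootstrap;
import com.sun.jdi.VirtualMachine;
import com.sun.jdi.connect.AttachingConnector;
import com.sun.jdi.connect.Connector;
import java.util.Map;

public class HotSwapCapabilities {
    public static void main(String[] args) throws Exception {
        // Locate the standard socket-attaching connector.
        AttachingConnector connector = null;
        for (AttachingConnector c : Bootstrap.virtualMachineManager().attachingConnectors()) {
            if ("com.sun.jdi.SocketAttach".equals(c.name())) {
                connector = c;
            }
        }
        if (connector == null) {
            throw new IllegalStateException("No socket-attaching connector available");
        }
        Map<String, Connector.Argument> arguments = connector.defaultArguments();
        arguments.get("hostname").setValue("localhost");   // placeholder
        arguments.get("port").setValue("5005");            // placeholder debug port

        VirtualMachine vm = connector.attach(arguments);
        System.out.println("canRedefineClasses: " + vm.canRedefineClasses());
        System.out.println("canAddMethod: " + vm.canAddMethod());
        System.out.println("canUnrestrictedlyRedefineClasses: " + vm.canUnrestrictedlyRedefineClasses());
        vm.dispose();
    }
}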
You can, if you run your Java on a Smalltalk VM. Smalltalk has been doing this basically forever, and it is one of the reasons why Smalltalkers tend to do debugger-driven development as a superior form of test-driven development. Smalltalk VMs do the required clean-up of memory data structures. In Eliot Miranda's Spur (for Squeak, Pharo and Cuis) and in GemStone that is done lazily, but otherwise you might have to wait for all objects to be migrated. The reference-implementation Java VM probably has more optimizations than any Smalltalk VM you could run Java on at the moment.
The answer provided by E-Riz already has a good explanation of the reasons why the standard Java HotSwap technology only supports the modifications to existing methods and not addition of new class or methods to classes.
However, as has been described in a related SO discussion, the level of hot swapping you achieve depends on the tool chain you use. So, if you add the JRebel plug-in, you will be able to hot swap even when new methods and classes have been added.
There is another project, HotswapAgent: this is basically a Java agent that you run your Java container with, and you can activate it using a couple of command-line parameters (as mentioned in its quickstart).
I am debugging a problem that I've had for years in a Tomcat application: a memory leak that occurs when restarting an application, because the webapp classloader cannot be GC'd. I've taken snapshots of the heap with JProfiler and it seems that at least some of my static variables aren't getting freed up.
Certain classes have a static final member which is initialized when the class is first loaded, and because it's final I can't set it to null on app shutdown.
Are static final variables an anti-pattern in Tomcat, or am I missing something? I've just started poking around with JProfiler 8, so I may be misinterpreting what the incoming references are telling me.
Cheers!
Luke
It is from a few years ago, but this presentation I gave at JavaOne covers exactly this topic. The key steps for finding the leak are on slide 11, but there is a lot of background information that might be useful as well.
The short version is:
Trigger the leak
Force GC
Use a profiler to find an instance of org.apache.catalina.loader.WebappClassLoader that has the property started=false
Trace the GC roots of that object - those are your leaks
As I note in the presentation, finding the leaks is one thing, finding what triggered them can be a lot harder.
I would recommend running on the latest stable Tomcat version, as we are always improving the memory leak detection and prevention code, and the warnings and errors it generates may also provide some pointers.
Static variables should be garbage collected when the class itself is garbage collected, which in turn is the case when its class loader is garbage collected.
You can easily create a memory leak by having anything that wasn't loaded by the application's classloader hold a reference to any of your classes (or an instance of your classes). Look for things like callback listeners that you didn't remove properly (inner/anonymous classes are easily overlooked).
A single reference to one of your classes prevents its class loader, and in turn every class loaded by that class loader, from being garbage collected.
Edit, example of leaking an object that prevents GC of all your classes:
MemoryMXBean mx = ManagementFactory.getMemoryMXBean();
NotificationListener nl = new NotificationListener() {
    public void handleNotification(Notification notification, Object handback) { /* ... */ }
};
// filter and handback may be null; the registration itself is what leaks
((NotificationEmitter) mx).addNotificationListener(nl, null, null);
If you register a listener (NotificationListener here) with an object that exists outside of your application's scope (the MemoryMXBean here), your listener will stay live until it is explicitly removed. Since your listener instance holds a reference to its ClassLoader (your application's classloader), you have now created a strong reference chain preventing GC of the classloader, and in turn all the classes it loaded, and through that any static variables those classes hold.
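The corresponding fix is to deregister the listener when the application shuts down (for a webapp, typically from a ServletContextListener's contextDestroyed), for example:

try {
    ((NotificationEmitter) mx).removeNotificationListener(nl);
} catch (ListenerNotFoundException e) {
    // already removed, nothing left to leak
}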
Edit2: Basically you need to avoid this situation:
[Root ClassLoader]
|
v
[Application ClassLoader]
|
v
(Type loaded by Root).addSomething()
The JVM running the application server has loaded the JRE through the root class loader (and possibly the application server, too). That means those classes will never become eligible for GC, since there will always be live references to some of them. The application server will load your application in a separate class loader, to which it will (or at least should) no longer hold a reference when your application is redeployed. But your application will share classes with the application server: at least the JRE's classes, and commonly the application server's classes as well.
In the hypothetical case where the application server created a separate class loader with no parent (practically a second root class loader) and tried to load the JRE a second time (as private to your application), it would cause a lot of problems. Classes intended to be singletons would exist twice, and the two class hierarchies would be incapable of holding any references to each other (because the same class loaded by different class loaders is a different type for the JVM). They couldn't even use java.lang.Object as a reference type for the other class loader's objects.
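To illustrate that last point, here is a small self-contained sketch (the classpath URL and class name are made up) showing that the "same" class loaded by two independent class loaders is two distinct types as far as the JVM is concerned:

import java.io.File;
import java.net.URL;
import java.net.URLClassLoader;

public class TwoLoadersDemo {
    public static void main(String[] args) throws Exception {
        URL[] cp = { new File("build/classes/").toURI().toURL() };   // hypothetical classpath
        ClassLoader a = new URLClassLoader(cp, null);                // null parent: no delegation
        ClassLoader b = new URLClassLoader(cp, null);

        Class<?> fromA = a.loadClass("com.example.Singleton");       // hypothetical class
        Class<?> fromB = b.loadClass("com.example.Singleton");

        System.out.println(fromA == fromB);                          // false: two distinct classes
        Object instance = fromA.newInstance();
        fromB.cast(instance);                                        // throws ClassCastException
    }
}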
This Blog can give you an idea about the memory leak in your application.
We encountered an OutOfMemoryError.
I analyzed the *.phd file that WebSphere dumps using Eclipse Memory Analyzer (MAT).
The Leak Suspects report of MAT provides the following information:
The class "com.ibm.rmi.io.ValueHandlerPool", loaded by "com.ibm.oti.vm.BootstrapClassLoader # 0x466578", occupies 68,734,136 (50.25%) bytes. The memory is accumulated in one instance of "java.util.Hashtable$Entry[]" loaded by "com.ibm.oti.vm.BootstrapClassLoader # 0x466578".
But I am not able to relate this leak suspect to any of the application's classes. There is no apparent link.
Any pointers how to go about the analysis ?
Environment: we use WebSphere 6.1 on JDK 1.4.2, running on Windows. The DB is Oracle 10gR1.
The application is a Struts/EJB application.
Try 'drilling down' into the Entry[] instance. It should show you what the entries are.
On an unrelated note, the package com.ibm.rmi.io hints that it might be related to RMI (which includes EJB remote calls).
Also look at the number of entries in that hashtable. For example, if you have one massive entry, that would indicate something deeper (maybe cached data somewhere). If you have thousands of entries in the hashtable, it may indicate that you are leaking instances.
I normally find that a leaking application is caused by information being cached and never disposed of. Does the leak happen at startup, grow while idle, or only occur under load?
I recently began profiling an OSGi Java application that I am writing, using VisualVM. One thing I have noticed is that when the application starts sending data to a client (over JMS), the number of loaded classes starts increasing at a steady rate. The heap size and the PermGen size remain constant, however. The number of classes never falls, even after it stops sending data. Is this a memory leak? I think it is, because the loaded classes have to be stored somewhere; however, the heap and PermGen never increase, even after I run the application for several hours.
For the screenshot of my profiling application go here
Are you dynamically creating new classes on the fly somehow?
Thanks for your help. I figured out what the problem is. In one of my classes, I was using JAXB to create an XML string. In doing this, JAXB uses reflection to create a new class.
JAXBContext context = JAXBContext.newInstance(this.getClass());
So although the JAXBContext wasn't staying around in the heap, the classes it caused to be loaded were.
I have run my program again, and I see a normal plateau as I would expect.
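For anyone hitting the same thing: the usual fix is to create the JAXBContext once and reuse it, since JAXBContext is thread-safe (Marshaller and Unmarshaller instances are not). A sketch, with Payload standing in for your own class:

import java.io.StringWriter;
import javax.xml.bind.JAXBContext;
import javax.xml.bind.JAXBException;

public class XmlSerializer {

    // Created once; reusing the context avoids generating (and class-loading)
    // new JAXB implementation classes on every call.
    private static final JAXBContext CONTEXT;
    static {
        try {
            CONTEXT = JAXBContext.newInstance(Payload.class);   // Payload is a placeholder type
        } catch (JAXBException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    public String toXml(Payload payload) throws JAXBException {
        StringWriter out = new StringWriter();
        CONTEXT.createMarshaller().marshal(payload, out);
        return out.toString();
    }
}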
I'm willing to bet that your problem is related to bytecode generation.
Many libraries use CGLib, BCEL, Javassist or Janino to generate bytecode for new classes at runtime and then load them through a controlled classloader. The only way to release these classes is to release all references to that classloader.
Since each class holds a reference to its classloader, this also means that you must release the references to all of those classes as well. You can catch these with a decent profiler (I use YourKit: search for multiple classloader instances with the same retained size).
One catch is that the JVM does not unload classes by default (the reason is backwards compatibility: people assume, wrongly, that static initializers are executed only once; the truth is that they get executed every time a class is loaded). To enable class unloading, you should pass the following options:
-XX:+CMSPermGenSweepingEnabled -XX:+CMSClassUnloadingEnabled
(tested with JDK 1.5)
Even then, excessive bytecode generation is not a good idea, so I suggest you look in your code to find the culprit and cache the generated classes. Frequent offenders are scripting languages, dynamic proxies (including the ones generated by application servers) and huge Hibernate models (in that last case you can just increase your PermGen).
See also:
http://blogs.oracle.com/watt/resource/jvm-options-list.html
http://blogs.oracle.com/jonthecollector/entry/presenting_the_permanent_generation
http://forums.sun.com/thread.jspa?messageID=2833028
You might find some hotspot flags to be of use in understanding this behavior like:
-XX:+TraceClassLoading
-XX:+TraceClassUnloading
This is a good reference:
http://java.sun.com/javase/technologies/hotspot/vmoptions.jsp
Unless I misunderstand, we're looking here at loaded classes, not instances.
When your code first references a class, the JVM has the ClassLoader go out and fetch the information about the class from a .class file or the like.
I'm not sure under what conditions it would unload a class. Certainly it should never unload any class with static information.
So I would expect a pattern roughly like yours: as your application runs, it enters new areas and references new classes, so the number of loaded classes goes up and up.
However, two things seem strange to me:
Why is it so linear?
Why doesn't it plateau?
I would expect it to trend upwards, but in a wobbly line, and then taper off as the JVM has already loaded most of the classes your program references. I mean, there are a finite number of classes referenced in most applications.
Are you dynamically creating new classes on the fly somehow?
I would suggest running a simpler test app through the same profiler to get a baseline case. Then you could consider implementing your own ClassLoader that spits out some debug information (see the sketch below), or maybe there is a tool that makes it report.
You need to figure out what these classes being loaded are.
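A ClassLoader that reports what it is asked to load can be as simple as the following sketch, a delegating loader you would wire in wherever you currently create your loaders:

public class LoggingClassLoader extends ClassLoader {

    public LoggingClassLoader(ClassLoader parent) {
        super(parent);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        // Log every request before delegating to the normal loading logic.
        System.out.println("loadClass: " + name);
        return super.loadClass(name, resolve);
    }
}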
Yes, it's usually a memory leak (since we don't really deal with memory directly, it's more of a class-instance leak). I've gone through this process before, and usually it's some listener added to an old toolkit that didn't remove itself.
In older code, a listener relationship causes the "listener" object to remain around. I'd look at older toolkits, or ones that haven't been through many revs. Any long-lived library targeting a later JDK would know about reference objects (java.lang.ref), which remove the requirement to explicitly remove listeners.
Also, call dispose on your windows if you recreate them each time. I don't think they ever go away if you don't (actually, there is also a dispose-on-close setting).
Don't worry about Swing or JDK listeners, they should all use references so you should be okay.
Use the Eclipse Memory Analyzer to check for duplicated classes and memory leaks. It might happen that the same class gets loaded more than once.
Regards,
Markus