I recently began profiling an OSGi Java application that I am writing, using VisualVM. One thing I have noticed is that when the application starts sending data to a client (over JMS), the number of loaded classes starts increasing at a steady rate. The heap size and the PermGen size remain constant, however. The number of classes never falls, even after it stops sending data. Is this a memory leak? I think it is, because the loaded classes have to be stored somewhere; however, the heap and PermGen never increase, even after I run the application for several hours.
Are you dynamically creating new classes on the fly somehow?
Thanks for your help. I figured out what the problem is. In one of my classes, I was using JAXB to create an XML string. In doing this, JAXB uses reflection to create a new class.
JAXBContext context = JAXBContext.newInstance(this.getClass());
So although the JAXBContext wasn't staying around in the heap, the classes had been loaded.
I have run my program again, and I see a normal plateau as I would expect.
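For anyone hitting the same thing: the usual fix is to create the JAXBContext once and reuse it, since the context is thread-safe and building it is what loads the reflectively generated classes. A minimal sketch of that idea (StatusMessage and its field are purely illustrative, not from my actual code):

import java.io.StringWriter;
import javax.xml.bind.JAXBContext;
import javax.xml.bind.JAXBException;
import javax.xml.bind.annotation.XmlRootElement;

@XmlRootElement
public class StatusMessage {
    public String text;

    // Built once and reused: JAXBContext is thread-safe, and creating it is
    // what loads the reflectively generated classes.
    private static final JAXBContext CONTEXT;
    static {
        try {
            CONTEXT = JAXBContext.newInstance(StatusMessage.class);
        } catch (JAXBException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    public String toXml() throws JAXBException {
        StringWriter writer = new StringWriter();
        // Marshaller instances are cheap and not thread-safe, so create one per call.
        CONTEXT.createMarshaller().marshal(this, writer);
        return writer.toString();
    }
}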
I'm willing to bet that your problem is related to bytecode generation.
Many libraries use CGLib, BCEL, Javassist or Janino to generate bytecode for new classes at runtime and then load them from a controlled classloader. The only way to release these classes is to release all references to that classloader.
Since the classloader is held by each class it loaded, this also means that you should release the references to all of those classes as well. You can catch these with a decent profiler (I use YourKit - search for multiple classloader instances with the same retained size).
One catch is that the JVM does not unload classes by default (the reason is backwards compatibility - people assume, wrongly, that static initializers are executed only once; the truth is that they get executed every time a class is loaded). To enable unloading, you should use the following options:
-XX:+CMSPermGenSweepingEnabled -XX:+CMSClassUnloadingEnabled
(tested with JDK 1.5)
Even then, excessive bytecode generation is not a good idea, so I suggest you look through your code to find the culprit and cache the generated classes. Frequent offenders are scripting languages, dynamic proxies (including the ones generated by application servers) or a huge Hibernate model (in that case you can just increase your PermGen).
See also:
http://blogs.oracle.com/watt/resource/jvm-options-list.html
http://blogs.oracle.com/jonthecollector/entry/presenting_the_permanent_generation
http://forums.sun.com/thread.jspa?messageID=2833028
You might find some hotspot flags to be of use in understanding this behavior like:
-XX:+TraceClassLoading
-XX:+TraceClassUnloading
This is a good reference:
http://java.sun.com/javase/technologies/hotspot/vmoptions.jsp
Unless I misunderstand, we're looking here at loaded classes, not instances.
When your code first references a class, the JVM has the ClassLoader go out and fetch the information about the class from a .class file or the like.
I'm not sure under what conditions it would unload a class; in practice, a class can only be unloaded once its classloader becomes unreachable, so most classes stay loaded for the life of the application.
So I would expect a pattern roughly like yours, where as your application runs it goes into areas and references new classes, so the number of loaded classes would go up and up.
However, two things seem strange to me:
Why is it so linear?
Why doesn't it plateau?
I would expect it to trend upwards, but in a wobbly line, and then taper off on the increase as the JVM has already loaded most of the classes your program references. I mean, there are a finite number of classes referenced in most applications.
Are you dynamically creating new classes on the fly somehow?
I would suggest running a simpler test app through the same debugger to get a baseline case. Then you could consider implementing your own ClassLoader that spits out some debug information, or maybe there is a tool to make it report.
You need to figure out what these classes being loaded are.
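If you go the custom ClassLoader route, a minimal sketch might look like the following (purely illustrative; the simpler alternative is to start the JVM with -verbose:class, which prints every class as it is loaded):

public class LoggingClassLoader extends ClassLoader {
    public LoggingClassLoader(ClassLoader parent) {
        super(parent);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        // Print every class this loader is asked to resolve, then delegate as usual.
        System.out.println("loadClass: " + name);
        return super.loadClass(name, resolve);
    }
}

You would still have to arrange for your own classes to be loaded through it, which is why -verbose:class is often the quicker option.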
Yes, it's usually a memory leak (since we don't really deal with memory directly, it's more of a class instance leak). I've gone through this process before, and usually it's some listener added to an old toolkit that didn't remove itself.
In older code, a listener relationship keeps the "listener" object reachable. I'd look at older toolkits, or ones that haven't been through many revisions. Any long-lived library targeting a later JDK would use reference objects, which removes the requirement to explicitly remove listeners (see the sketch below).
Also, call dispose() on your windows if you recreate them each time. I don't think they ever go away if you don't (actually, there is also a dispose-on-close setting).
Don't worry about Swing or JDK listeners, they should all use references so you should be okay.
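As a sketch of what the reference-object approach looks like (names here are illustrative, not from any particular toolkit): the subject keeps only WeakReferences to its listeners, so a listener that is otherwise unreachable can be collected even if removeListener() was never called.

import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class WeakListenerSupport {
    public interface Listener {
        void onEvent(String event);
    }

    private final List<WeakReference<Listener>> listeners = new ArrayList<>();

    public void addListener(Listener l) {
        listeners.add(new WeakReference<>(l));
    }

    public void fire(String event) {
        for (Iterator<WeakReference<Listener>> it = listeners.iterator(); it.hasNext(); ) {
            Listener l = it.next().get();
            if (l == null) {
                it.remove();              // the listener was garbage collected
            } else {
                l.onEvent(event);
            }
        }
    }
}

The caveat is that the caller must keep a strong reference to the listener for as long as it should receive events; otherwise it can be collected prematurely.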
Use the Eclipse Memory Analyzer to check for duplicated classes and memory leaks. It might happen that the same class gets loaded more than once.
Related
I have gone through the hot deployment tutorial and it works.
But I have a question about the limitation (point 3), i.e.:
Hot deploy supports code changes in method implementations only. If you add a new class or a new method, a restart is still required.
Basically, why don't we need a server restart if I make changes to an existing method, but one is required when adding a new method or class?
My understanding of how it works: when I make changes to an existing method or introduce a new method, Eclipse will place the file at the right location
under the web server. If the class has already been loaded by a classloader into PermGen space, the JVM will unload it from PermGen and load the new one internally without a server restart, so that the new changes (bytecode) are reflected. Is that correct?
If yes, why does hot deployment not work for new methods and new class files?
The reasoning is quite complicated and really only fully known to people with intimate knowledge of the JVM and how it manages memory. There is a decent explanation in the Java HotSwap Guide, in the section titled "Why is HotSwap limited to method bodies?" (although it's really an advertisement for the JRebel product).
The gist: there are two primary factors that prevent HotSwap from handling structural changes to classes: JIT and memory allocation.
The JIT (Just In Time) compiler in the JVM optimizes the bytecode after classes have been loaded and run a few times, basically inlining many calls for increased performance. Implementing that feature safely and effectively in an environment where class signatures and structure can change would be a significant challenge.
Other problems surround what would happen regarding memory management if class structures were allowed to change. The JVM would have to modify existing instances of classes, which would mean relocating them to other parts of the heap storage. Not to mention having to relocate the class objects themselves. The JVM's memory management is already incredibly complex and highly optimized; such changes would only increase the complexity and potentially reduce performance of the JIT compiler (and likely lead to additional bugs).
I think it's safe to assume that the JVM engineers have not been willing to take the performance and bug footprint tradeoffs that would be required to support this feature. Which is why products like JRebel and others have come to exist.
As a side note, the specification itself is not limited.
It just happens that some of the available implementations, including the ubiquitous Reference Implementation, are limited.
After you connect to a remote VM, you can check whether it allows adding methods or redefining classes.
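For example, with JDI you can attach to the debuggee and query its capabilities. A minimal sketch, assuming the target VM was started with a JDWP socket agent listening on port 5005 (the host and port are illustrative):

import com.sun.jdi.Bootstrap;
import com.sun.jdi.VirtualMachine;
import com.sun.jdi.connect.AttachingConnector;
import com.sun.jdi.connect.Connector;
import java.util.Map;

public class HotSwapCapabilities {
    public static void main(String[] args) throws Exception {
        // The target VM must have been started with something like
        // -agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=5005
        AttachingConnector connector = null;
        for (AttachingConnector c : Bootstrap.virtualMachineManager().attachingConnectors()) {
            if ("dt_socket".equals(c.transport().name())) {
                connector = c;
            }
        }

        Map<String, Connector.Argument> arguments = connector.defaultArguments();
        arguments.get("hostname").setValue("localhost");
        arguments.get("port").setValue("5005");

        VirtualMachine vm = connector.attach(arguments);
        System.out.println("canRedefineClasses: " + vm.canRedefineClasses());
        System.out.println("canAddMethod: " + vm.canAddMethod());
        System.out.println("canUnrestrictedlyRedefineClasses: " + vm.canUnrestrictedlyRedefineClasses());
        vm.dispose();
    }
}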
You can if you run your Java on a Smalltalk VM. Smalltalk has been doing this basically forever, and it is one of the reasons why Smalltalkers tend to do debugger-driven development as a superior form of test-driven development. Smalltalk VMs do the required clean-up of memory data structures. In Eliot Miranda's Spur (for Squeak, Pharo and Cuis) and in GemStone that is done lazily, but otherwise you might have to wait for all objects to be migrated. The reference-implementation Java VM probably has more optimizations than any Smalltalk VM you could run Java on at the moment.
The answer provided by E-Riz already has a good explanation of the reasons why the standard Java HotSwap technology only supports the modifications to existing methods and not addition of new class or methods to classes.
However, as has been described in a related SO discussion, the level of hot swapping you achieve depends on the tool chain you use. So, if you end up adding the JRebel plug-in, you would be able to perform hot swapping even when new methods and classes have been added.
There is another project: Hot Swap Agent. This is basically a Java agent that you use to run your Java container, and you can activate it using a couple of command-line parameters (as mentioned in the quickstart).
Problem background
I run a Groovy/Grails system that dynamically compiles and loads user-defined code. This is basically done via GroovyClassLoader.
I'm seeing the dynamic classes themselves both loaded AND unloaded just fine. I have added the
-XX:+CMSClassUnloadingEnabled -XX:+CMSPermGenSweepingEnabled
flags and that works fine. The dynamic classes are being unloaded when the classloader is GC'ed.
Problem description
However, my heap memory is leaking, and this is the source of the leak:
Now, the Launcher$AppClassLoader has a property parallelLockMap defined in java.lang.ClassLoader. Apparently this ConcurrentHashMap is used to hold parallel class loading locks in Java 7. Check out the source code.
The entries held in this map have String keys (the class name for which the lock applies) and Object values (the lock objects). There are two kinds of leaking keys:
"groovy.runtime.metaclass.MyDynamicallyLoadedClassNameMetaClass"
"MyDynamicallyLoadedClassNameBeanInfo"
So to me it looks like these keys are created in parallelLockMap when the *MetaClass and *BeanInfo classes are loaded, but they are not removed when the associated dynamically loaded class (and its metaclasses) is unloaded. Now this is getting quite deep into the internals of Groovy and its metaclass system, and I'm running out of expertise.
Test Case
This will eventually run you out of heap space, although it will take a while:
String newClass = "class CLASSNAME {}"
while (true) {
    GroovyClassLoader gcl = new GroovyClassLoader()
    Class clazz = gcl.parseClass(newClass.replace("CLASSNAME", "NewClass" + System.nanoTime()))
    clazz.newInstance()
}
Be sure to run this with the above JVM flags so that you run out of heap space, not PermGen space. Again, PermGen gets garbage collected nicely, no leak there.
Questions
1) Is this a bug in Groovy or Java 7? Whose responsibility would it be to clean the parallelLockMap? Should I submit an issue report?
2) Is there a workaround? I'm thinking of using a custom ClassLoader that does not first try to delegate class loading to its parent for these MetaClass and BeanInfo classes, thus preventing the call to java.lang.ClassLoader#loadClass(..). I'm not an expert in Java/Groovy classloading, though.
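To make question 2 concrete, here is a rough, untested sketch of the kind of non-delegating loader I have in mind (the class name and the exact name filters are my own guesses, not a known fix):

import groovy.lang.GroovyClassLoader;

public class MetaClassAwareLoader extends GroovyClassLoader {
    public MetaClassAwareLoader(ClassLoader parent) {
        super(parent);
    }

    @Override
    public Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        // For the synthetic metaclass/BeanInfo lookups, do not delegate to the
        // parent, so the parent's parallelLockMap never sees these names.
        if (name.startsWith("groovy.runtime.metaclass.") || name.endsWith("BeanInfo")) {
            Class<?> local = findLoadedClass(name);
            if (local != null) {
                return local;
            }
            throw new ClassNotFoundException(name);
        }
        return super.loadClass(name, resolve);
    }
}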
EDIT: I have submitted a Groovy JIRA here.
EDIT: On the JDK side this issue has recently been reported here.
How to load existing class objects in JVM from another JVM?
I am analyzing a rare scenario in my server. I do not have proper logs in my server to help me solve the situation, and I believe that it may be a problem with a particular class object (user defined).
Say for example below is the class:
public class MyRequest
{
    private byte[] somdata;

    public byte[] getData()
    {
        return somdata;
    }
}
Currently, hundreds of objects of the above class are in my server JVM's memory. I want to know if there is a possibility to load all of those objects and access their data/methods (getData()).
I do not want to create a new instance of the MyRequest class (that I know is pretty easy). I want to load the existing objects from my JVM through another Java process.
P.S.: I cannot kill my server for any reason.
P.S.: And I cannot install any tools like VisualVM; moreover, such tools tell us the objects' types and memory, but not the exact data.
Basically, it won't work.
If you can't attach a debugger, you can't do anything.
If you could attach a debugger, you should be able to find and look at those instances, but you won't be able to get them to do something they weren't designed to do. In particular, if they are not designed to be serializable, you won't be able to serialize them.
I think your best bet is to change your server code to improve the logging, and then restart it with a debugger agent ... and wait for the problem to recur.
And of course, if you have a debugger attached, you don't need to move objects to another JVM. You can just look at their state directly.
However, there's a catch. Many "amazingly rare" scenarios are actually related to threading, thread-safety and timing problems. And many things you can do to observe the effects of such a bug are liable to alter the program's behaviour.
FOLLOWUP
So if we know the starting address of the Virtual memory for that JVM...can we not know the data? assuming all objects are within the JVM memory space.
It is not as simple as that:
Locations of objects on the Java heap are not predictable.
Locations of thread stacks are not predictable.
and so on.
It may be theoretically possible to dump the memory of any process, and reconstruct the execution state of the JVM, and "read" the state of the objects. But you'd need specialized tools and/or a great deal of knowledge of JVM internals to do this. I'm not even sure if the tools exist ...
In short, it is not practical, AFAIK.
Objects and their references (aliases) are bound to the current running JVM. There is no possibility to share them between several JVMs.
If you want to "share" data between two JVMs, you must serialize this data, which means sending it from one JVM to the other. This also requires the classes whose instances are to be serialized to implement the Serializable interface. Note that arrays automatically implement Serializable.
You can either stream those serializable objects yourself using sockets, output and input streams (which is a lot of effort), or you can use RMI for calling remote methods and just stream your data. In either case, all objects are copied and built up again in the other JVM. There is no way to have them shared.
In case of application servers, RMI calls are typically invoked by just using EJBs. But you need an application server; just using a web server is not enough.
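As a minimal sketch of the do-it-yourself streaming variant (the Payload class and port are illustrative; in the question's case MyRequest would first have to implement Serializable):

import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.net.ServerSocket;
import java.net.Socket;

// Illustrative data class: it must implement Serializable to cross JVMs.
class Payload implements Serializable {
    byte[] data;
}

public class CopyBetweenJvms {
    // Run in JVM A: accepts one connection and reads a copy of the object.
    static Payload receive(int port) throws Exception {
        try (ServerSocket server = new ServerSocket(port);
             Socket socket = server.accept();
             ObjectInputStream in = new ObjectInputStream(socket.getInputStream())) {
            return (Payload) in.readObject();   // a new, equal copy - not the original
        }
    }

    // Run in JVM B: serializes the object and streams it over.
    static void send(String host, int port, Payload payload) throws Exception {
        try (Socket socket = new Socket(host, port);
             ObjectOutputStream out = new ObjectOutputStream(socket.getOutputStream())) {
            out.writeObject(payload);
        }
    }
}

Either way, the receiving JVM ends up with a copy built from the byte stream, never with the sender's original instance.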
Load existing class objects in JVM from another JVM
It's not possible.
Note that you can tell the JVM to dump its state - with a kill signal or similar - to disk, so you can use post-mortem tools to analyze the state of your program.
Keywords are "core" and "hprof" and I have not done this myself yet.
I'm using Tomcat and after stopping my web application there's still a reference to the classloader instance of my web application.
The consequence is that a notable amount of memory (mostly related to static data) will not be freed; sooner or later this results in an OutOfMemoryError.
I took a heap dump and realized that the classloader is held by a JNI global reference, which prevents it from being garbage collected.
My application does not use JNI. I am also not using the Apache Tomcat Native Library. I am using a Sun/Oracle JDK.
I'd like to track down the cause/origin of this global reference.
(My guess is that the JVM internally references the classloader - but why/where?).
Question:
Which approaches/toolsets exists to achieve this?
UPDATE
It seems that bestsss is right and the JNI global reference has been introduced by the JVM debug mode. This helped me out, but it does not answer the question, so I am still curious to get an answer to the question, which might be helpful in the future.
Besides the obvious case: Threads, there is one more:
Are you using your application in debug mode?
The JVM does not hold references to any classloader besides the system one, but that shouldn't concern you. The rest of the JNI references are either Threads or just debug-held objects (provided you don't use JNI and lock the objects down yourself).
JNI references are just GC roots; edit your question and post exactly which objects are held by those references.
The first thing I'd do is run with -Xcheck:jni on and see if it comes up with anything. I wouldn't expect it to; it doesn't sound like there's anything weird happening with JNI, just incorrect use being made of it. However, it's good to make sure of that.
If you're on a Sun JVM, I think you can use -XX:TraceJNICalls to get an overwhelming listing of JNI calls as they happen. That should let you get an idea of what calls are being made, and from there work towards what is making them, and why this is causing a problem.
JRockit mission control: http://download.oracle.com/docs/cd/E13150_01/jrockit_jvm/jrockit/tools/index.html
A nice GUI tool that should help you find it pretty quick.
You could try jstack.
Maybe one of the listed stacktraces will show you the origin of the global reference.
I am currently trying to determine the cause of high memory usage in a Java application running on an exotic platform where I know of no instrumented JVM.
I have the source to the application, and can make changes to the source for the purposes of testing.
How can I debug memory usage under these conditions?
If more info is needed, I'll be happy to provide it. I'm just a little lost trying to use such an old JVM without much tooling to speak of.
If I were in your shoes I would approach it with:
1) Find the functional areas you know need attention.
2) Make a backup copy of the code.
3) Start inserting print statements with start and end times (a rough sketch follows below).
4) See what takes a lot of time and narrow it down.
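A minimal sketch of step 3 (the method name and the suspect section are placeholders). It sticks to System.currentTimeMillis(), Runtime and System.out, so it should work even on a 1.1-era JVM; the used-memory print is my addition, since the question is about memory rather than time:

public class TimedSection {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long start = System.currentTimeMillis();
        System.out.println("suspectArea start at " + start
                + ", used=" + (rt.totalMemory() - rt.freeMemory()));

        // ... the functional area under suspicion goes here ...

        long end = System.currentTimeMillis();
        System.out.println("suspectArea end, took " + (end - start)
                + " ms, used=" + (rt.totalMemory() - rt.freeMemory()));
    }
}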
For Java 5 and later, instrumentation can be done using Java agents. For earlier versions - including 1.1.8 - you must load native agents to do this. If you cannot instrument your code, you must do the work needed yourself.
One approach to get most of the way there is to use a Java 1.1 compatible version of log4j, which allows you to essentially write out strings prepended with a timestamp. The output can then be massaged afterwards to extract answers to whatever you want to know.
If you need memory profiling - and I'd recommend against this - you could start serializing objects out to disk, then measuring disk size as a rough estimate of memory size.
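If you do go that route, a rough sketch of the idea (writing to an in-memory buffer here for brevity rather than to disk; the serialized size is only a crude proxy for heap footprint, and the whole object graph must be Serializable):

import java.io.ByteArrayOutputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerializedSizeEstimate {
    static int estimateSize(Serializable obj) throws Exception {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        ObjectOutputStream out = new ObjectOutputStream(bytes);
        out.writeObject(obj);   // the byte count approximates the size of the object graph
        out.close();
        return bytes.size();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("approx bytes: " + estimateSize(new java.util.Vector()));
    }
}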
If you really want to dig into where you're usually not supposed to be, try the sun.misc package, although I don't know how much of that was around in 1.1.x.