I am desperate for help but was unable to find anything on the web about this particular subject (many related ones that leave my particular problem unanswered).
Specifically, I need to be able to download code (jars) from a central and external code repository. This is done by the bootstrap code that needs to add this to the classpath of a class loader to be used thereafter. This is when we enter the subject that has been discussed so many times. I don't like hacks, so I tried the following:
Attempt #1: Create an instance of URLClassLoader configured for this purpose, then invoke the "rest" of the code through it.
Failure: There are 1.5 problems here (one may be the cause of another). One is that URLClassLoader, normally, prefers to load stuff from its parent. Some code has to exist in both, possibly different versions. If the code from the parent is used, it continues using the "outer" class loader for the rest of loading, which is not what we want, even when the initial loading is OK. Secondly, some third party libraries seem to access the system class loader directly, either by design or accidentally (may get it from one of the classes loaded by it).
Attempt #2: Create my subclass of the URLClassLoader that prefers self over the parent. Overrode loadClass, getResource, getResources, getPackage, getPackages... later other methods too to make sure of this.
Failure: Didn't help (enough). That third party code still couldn't load some resources.
Attempt #3 Create another custom subclass of the URLClassLoader and set it as the system class loader using -Djava.system.class.loader=...
Failure: This worked better - went further, but still failed trying to get resources. This time it was different resources, though. I added logging to all the overridden methods to log their calls and resource names. Regular resources were MOSTLY found. Some still weren't, even though they are there (confirmed). But something I don't know about even though I tried hard to learn is about many calls with resource names that end with a slash. Some also have slashes where a dollar sign would normally appear (nested/inner class resources). Some examples that were requested but NOT found:
com/acme/foo/bar/ClassName/
com/acme/foo/bar/ClassName/InnerClassName/
When I run the downloaded code with all content on the initial/boot classpath (and do not use my classloader), everything works fine - thus my class loader breaks things, but I need it to work.
My closest guesses are:
Third party code gets hold of the true system class loader somehow, perhaps via some class that was loaded by it, then uses that. I don't see requests to it and they are bound to fail because it does not have the entire class path.
This business with resource names ending in slashes is the cause by being supported by the true system class loader but not by the URLClassLoader I am subclassing. I can only guess that the expected return URL somehow locates the collection of resources with that name as prefix. That would be tough to match, although possible. Furthermore, it appears that some slashes are in positions where a dollar sign separating the inner class name should be, i.e. in the above example (spaces added for clarity):
com/acme/foo/bar/ClassName / InnerClassName/
com/acme/foo/bar/ClassName $ InnerClassName/
Please note that I cannot rely on hacking the actual system classloader by assuming that it is a subclass of the URLClassLoader and using reflection to call its addURL(URL) method.
Is there a way to make this work? Please help!
UPDATE
I just made an additional attempt. I created a dummy wrapper classloader (extending ClassLoader, not URLClassLoader) that only logs requests, then passes them on to the parent (public methods) or superclass (protected methods). I set this to be the system class loader and manually added the entire "inner" class path to the actual outer one, then tried to run the code. That works correctly, just as it does without the custom system class loader. What was logged also identified that even the system class loader return null for these resources ending in slashes for MOST of them, but not all. I did not check whether these also work in the my real code but guessing they may - as they were not the stumbling block. Somehow the custom system classloader is still being bypassed. How?
UPDATE 2
In my custom system class loaders I have let some classes come from the outer/true system class loader, e.g. those in java.lang. I am now suspecting that I should not have and that the inner "world" must be completely isolated. That would make it problematic, though, to communicate with it and all I would have left is reflection... but not sure whether that would even work - i.e. can there be more than one java.lang.Class and/or java.lang.Object?
Given all constraints this does not appear entirely possible in a rock solid fashion as I wanted it:
a) Third party libraries may always "misbehave" and get hold of lassloaders they are not supposed to use one way or another. I looked at OneJar as suggested by fge but they have the same issue - they only detect a possibility of it.
b) There is no way to completely replace/hide the system class loader.
c) Casting the system class loader to a URLClassLoader may stop working at any moment.
It seems, you didn’t understand the class loader structure. Within an ordinary JVM of the current version, there are at least three class loaders:
The bootstrap loader which is responsible for loading the core classes. Since this involves classes like Class and ClassLoader itself, it can’t be represented by a ClassLoader instance. All classes whose getClassLoader() returns null were loaded by the bootstrap loader
The extension loader. It is responsible for loading classes within the ext/ directory of the JRE. Afaik, it may vanish in future versions. Its parent loader is the bootstrap loader
The application loader. This is the one which will be returned by ClassLoader.getSystemClassLoader() and which will be used if no other parent was specified. In the current configurations, it’s parent is the extension loader, but maybe it will have the bootstrap loader as its direct parent in future versions
The conclusion is, if you want to reload your application’s classes without the delegation to the parent loader destroying your effort, you don’t need to manipulate the class loader’s implementation. You just have to specify the right parent. It’s as simple as
URLClassLoader cl=new URLClassLoader(urls, ClassLoader.getSystemClassLoader().getParent());
That way, the new class loader’s parent will be the original application class loader’s parent, thus the already loaded application classes are not in the scope of the new loader while everything else works as usual.
Regarding the resources ending with a slash, they are rather uncommon. They may get resolved when they actually refer to a directory but that depends on the protocol of the URL and the actual handler for that protocol. E.g. it might work for file: URLs but usually doesn’t for jar: URLs unless the jar file contains pseudo-entries for directories. I’ve also seen it working for ftp: URLs.
Another thing to understand is that if one class directly refers to another class, its original defining class loader will be queried, not necessarily the application class loader. E.g. when the class java.lang.String contains a reference to java.lang.Object (it does), this reference will be directly resolved using the bootstrap loader as this is the defining loader of java.lang.String.
This implies that if you manipulate the parent lookup of a loader to not follow the standard parent delegation you are risking to resolve names to different runtime classes as the resolving of the same names when being referenced by classes loaded by the parent loader. You avoid such problems by following the standard procedure as in the solution above. The JRE classes will never contain references to your application classes and the new loader not having the original application loader as its parent will never interfere with the classes loaded by the original application loader.
Related
As we know java class loading works like in picture below.
When we have a notion of plugin in our application (like app servers) we sometimes need some classes to be loaded in the instances class loader rather than in parent, so that each new instance (plugin, webapp or whatever) loads that particular class not delegating the parent...
For example Log4j classes, we need them to be loaded for each instance.
So in order to do this the best approach that came into my mind is to write custom classloader that will take a list of class names which shall be prevented from being delegated to parent classloader (ideally we want instances to be in complete isolation).
So the the application that will load other "instances" will use that custom classloader while loading those "instances"...
Is there an implementation of such classloader that solves this issue? (given that we dont want to know what OSGi is and we work with pure java no frameworks etc...)
My searches end up pointing some frameworks or some web container specific solutions, yet this is quite a simple task that is solved with one class, I'd like to find out if i'm missing something before i start implementing it.
Update (digging deeper) : suppose there is a class loaded by bootstrap which has static state that can be shared between instances (and we really badly want to make sure instances are completely isolated), now that class is obviously not included in our classpath, but it is loaded, and if we copy it instead of reffering it we will achieve the required isolation.
So do we have the notion of copying or cloneing a class from one classloader to other?
I have a question about java ClassLoaders. I am trying to use different ClassLoaders to be able to run different versions of a JAR from within the same program.
I have heard somewhere that if you load one class using one ClassLoader all classes called (being loaded) from within that class would use the same ClassLoader. Is this correct?
If not, is there a neat way to set the context of a ClassLoader (let's say, everything being called from a specific class/library should use the same ClassLoader).
This is not a simple subject and i would advise doing more research online as no answer here will be nearly in depth enough. but, as a quick synopsis:
classes loaded via normal class references (i.e. a line of code in Class A which uses a variable of static type B) will be loaded using the same classloader as the initial class.
however, due to classloader delegation, a class may not actually be loaded by the ClassLoader from which the search originally started. example, i have Class A loaded by classloader LA with parent classloader LP. Class B is referenced by A, so the search for Class B will start with LA. however, the class bytes for B are actually found in LP, so LP loads the class and hands it to LA which returns it. ultimately, however, B is owned by LP, not LA.
with utilities which load classes via reflection (e.g. serialization, JAXB, Hibernate, etc.) or frameworks which are typically used with nested classloaders (e.g. Java EE appservers), all bets are off. typically utilities/frameworks like this load classes using the context classloader, but that is not always the case. each utility may have different priorities and fallbacks regarding which classloader is used. additionally, many have ways of explicitly providing a classloader at runtime.
as a rule of thumb, while executing code which you know is from a nested classloader (probably because you set it up), you should set the current context classloader appropriately.
Can someone clarify that the role of a ClassLoader is not only to load an individual class, but also its dependencies? And if so, what exactly does the entire process entail? I'm looking for implementation detail if at all possible.
For example, at some point, bytes are going to have to be read in from somewhere (a network or filesystem location), and file system locations are going to have to be calculated on the basis of a classes canonical name and a foreknowledge of class paths available to the JVM- how does an individual ClassLoader try to locate a file over potentially multiple class-paths? Where does it get this information from? Also, at what point are a class files bytes verified and its dependencies examined for availability?
As much detail as possible would be appreciated :)
ClassLoading is a very complex subject. The ClassLoader and Java security model are inextricably tied together. Essentially the JVM loads classes on demand. When there is a hierarchy of classloaders, the JVM attempts to resolve the class as far down the chain as possible. In short, if the class is defined in the "boot" classloader and in an application defined class loader, it will always use the version in the boot classloader.
Within a classloader, such as the URLClassLoader, the search order is the order in which you've told it to look. Essentially the array of URLs you told it had classes will be searched from the first entry to the last.
When the class that you defined references another class, that class is also resolved using the same algorithm. But here's the catch: it only resolves it relative to where it was found. Let's take the scenario where the class SomeCoolThing is in the boot classloader, but depends on SomeLameThing, which is in an application defined classloader. The process would look like this:
App-ClassLoader: resolveClass("SomeCoolThing")
parent->resolveClass("SomeCoolThing")
Boot-ClassLoader (the ultimate parent): resolveClass("SomeCoolThing")
SomeCoolThing needs SomeLameThing
resolveClass("SomeLameThing") // Can't find SomeLameThing!!!!
Even though SomeLameThing is in the classloader where you requested SomeCoolThing, SomeCoolThing was resolved in a different classloader. That other classloader has no knowledge of the child classloader, and tries to resolve it itself and fails.
I had a book a long time ago that covered the Java ClassLoaders in really good depth, and I recommend it. It's Java Security by O'Reilly Media. It will answer every question you never wanted to know, but still need to, when dealing with ClassLoaders and how they work.
I can answer some of your questions:
how does an individual ClassLoader try
to locate a file over potentially
multiple class-paths?
If you mean different class loaders have different classpaths then each class loader takes the properties (i.e. classpath) of the parent class loader. All things equal each class loader has the same classpath as any other (I believe; not sure if the JVM does anything weird internally). So MyClass.class is the same for a class loader and all child class loaders. If you have multiple MyClass.class defined on the same class path then the JVM picks up the first one. In the past I've created my own class loader and prepended a custom classpath onto the existing classpath to load classes at runtime that were not on the classpath when launched.
The get to the nuts and bolts of it I'm sure there is a spec out there that describes this or you could download the JVM code (the assembly/C/C++ code) and go though that but I've had to do that and "it ain't pretty".
Of course "they" are changing the classpath stuff in 1.7 so I'm not sure how that is going to work...
Hope that helps a bit...
I have a custom class loader so that a desktop application can dynamically start loading classes from an AppServer I need to talk to. We did this since the amount of jars that are required to do this are ridiculous (if we wanted to ship them). We also have version problems if we don't load the classes dynamically at run time from the AppServer library.
Now, I just hit a problem where I need to talk to two different AppServers and found that depending on whose classes I load first I might break badly... Is there any way to force the unloading of the class without actually killing the JVM?
Hope this makes sense
The only way that a Class can be unloaded is if the Classloader used is garbage collected. This means, references to every single class and to the classloader itself need to go the way of the dodo.
One possible solution to your problem is to have a Classloader for every jar file, and a Classloader for each of the AppServers that delegates the actual loading of classes to specific Jar classloaders. That way, you can point to different versions of the jar file for every App server.
This is not trivial, though. The OSGi platform strives to do just this, as each bundle has a different classloader and dependencies are resolved by the platform. Maybe a good solution would be to take a look at it.
If you don't want to use OSGI, one possible implementation could be to use one instance of JarClassloader class for every JAR file.
And create a new, MultiClassloader class that extends Classloader. This class internally would have an array (or List) of JarClassloaders, and in the defineClass() method would iterate through all the internal classloaders until a definition can be found, or a NoClassDefFoundException is thrown. A couple of accessor methods can be provided to add new JarClassloaders to the class. There is several possible implementations on the net for a MultiClassLoader, so you might not even need to write your own.
If you instanciate a MultiClassloader for every connection to the server, in principle it is possible that every server uses a different version of the same class.
I've used the MultiClassloader idea in a project, where classes that contained user-defined scripts had to be loaded and unloaded from memory and it worked quite well.
Yes there are ways to load classes and to "unload" them later on. The trick is to implement your own classloader which resides between high level class loader (the System class loader) and the class loaders of the app server(s), and to hope that the app server's class loaders do delegate the classloading to the upper loaders.
A class is defined by its package, its name, and the class loader it originally loaded. Program a "proxy" classloader which is the first that is loaded when starting the JVM. Workflow:
The program starts and the real "main"-class is loaded by this proxy classloader.
Every class that then is normally loaded (i.e. not through another classloader implementation which could break the hierarchy) will be delegated to this class loader.
The proxy classloader delegates java.x and sun.x to the system classloader (these must not be loaded through any other classloader than the system classloader).
For every class that is replaceable, instantiate a classloader (which really loads the class and does not delegate it to the parent classloader) and load it through this.
Store the package/name of the classes as keys and the classloader as values in a data structure (i.e. Hashmap).
Every time the proxy classloader gets a request for a class that was loaded before, it returns the class from the class loader stored before.
It should be enough to locate the byte array of a class by your class loader (or to "delete" the key/value pair from your data structure) and reload the class in case you want to change it.
Done right there should not come a ClassCastException or LinkageError etc.
For more informations about class loader hierarchies (yes, that's exactly what you are implementing here ;- ) look at "Server-Based Java Programming" by Ted Neward - that book helped me implementing something very similar to what you want.
I wrote a custom classloader, from which it is possible to unload individual classes without GCing the classloader. Jar Class Loader
Classloaders can be a tricky problem. You can especially run into problems if you're using multiple classloaders and don't have their interactions clearly and rigorously defined. I think in order to actually be able to unload a class youlre going go have to remove all references to any classes(and their instances) you're trying to unload.
Most people needing to do this type of thing end up using OSGi. OSGi is really powerful and surprisingly lightweight and easy to use,
You can unload a ClassLoader but you cannot unload specific classes. More specifically you cannot unload classes created in a ClassLoader that's not under your control.
If possible, I suggest using your own ClassLoader so you can unload.
Classes have an implicit strong reference to their ClassLoader instance, and vice versa. They are garbage collected as with Java objects. Without hitting the tools interface or similar, you can't remove individual classes.
As ever you can get memory leaks. Any strong reference to one of your classes or class loader will leak the whole thing. This occurs with the Sun implementations of ThreadLocal, java.sql.DriverManager and java.beans, for instance.
If you're live watching if unloading class worked in JConsole or something, try also adding java.lang.System.gc() at the end of your class unloading logic. It explicitly triggers Garbage Collector.
I have a custom class loader so that a desktop application can dynamically start loading classes from an AppServer I need to talk to. We did this since the amount of jars that are required to do this are ridiculous (if we wanted to ship them). We also have version problems if we don't load the classes dynamically at run time from the AppServer library.
Now, I just hit a problem where I need to talk to two different AppServers and found that depending on whose classes I load first I might break badly... Is there any way to force the unloading of the class without actually killing the JVM?
Hope this makes sense
The only way that a Class can be unloaded is if the Classloader used is garbage collected. This means, references to every single class and to the classloader itself need to go the way of the dodo.
One possible solution to your problem is to have a Classloader for every jar file, and a Classloader for each of the AppServers that delegates the actual loading of classes to specific Jar classloaders. That way, you can point to different versions of the jar file for every App server.
This is not trivial, though. The OSGi platform strives to do just this, as each bundle has a different classloader and dependencies are resolved by the platform. Maybe a good solution would be to take a look at it.
If you don't want to use OSGI, one possible implementation could be to use one instance of JarClassloader class for every JAR file.
And create a new, MultiClassloader class that extends Classloader. This class internally would have an array (or List) of JarClassloaders, and in the defineClass() method would iterate through all the internal classloaders until a definition can be found, or a NoClassDefFoundException is thrown. A couple of accessor methods can be provided to add new JarClassloaders to the class. There is several possible implementations on the net for a MultiClassLoader, so you might not even need to write your own.
If you instanciate a MultiClassloader for every connection to the server, in principle it is possible that every server uses a different version of the same class.
I've used the MultiClassloader idea in a project, where classes that contained user-defined scripts had to be loaded and unloaded from memory and it worked quite well.
Yes there are ways to load classes and to "unload" them later on. The trick is to implement your own classloader which resides between high level class loader (the System class loader) and the class loaders of the app server(s), and to hope that the app server's class loaders do delegate the classloading to the upper loaders.
A class is defined by its package, its name, and the class loader it originally loaded. Program a "proxy" classloader which is the first that is loaded when starting the JVM. Workflow:
The program starts and the real "main"-class is loaded by this proxy classloader.
Every class that then is normally loaded (i.e. not through another classloader implementation which could break the hierarchy) will be delegated to this class loader.
The proxy classloader delegates java.x and sun.x to the system classloader (these must not be loaded through any other classloader than the system classloader).
For every class that is replaceable, instantiate a classloader (which really loads the class and does not delegate it to the parent classloader) and load it through this.
Store the package/name of the classes as keys and the classloader as values in a data structure (i.e. Hashmap).
Every time the proxy classloader gets a request for a class that was loaded before, it returns the class from the class loader stored before.
It should be enough to locate the byte array of a class by your class loader (or to "delete" the key/value pair from your data structure) and reload the class in case you want to change it.
Done right there should not come a ClassCastException or LinkageError etc.
For more informations about class loader hierarchies (yes, that's exactly what you are implementing here ;- ) look at "Server-Based Java Programming" by Ted Neward - that book helped me implementing something very similar to what you want.
I wrote a custom classloader, from which it is possible to unload individual classes without GCing the classloader. Jar Class Loader
Classloaders can be a tricky problem. You can especially run into problems if you're using multiple classloaders and don't have their interactions clearly and rigorously defined. I think in order to actually be able to unload a class youlre going go have to remove all references to any classes(and their instances) you're trying to unload.
Most people needing to do this type of thing end up using OSGi. OSGi is really powerful and surprisingly lightweight and easy to use,
You can unload a ClassLoader but you cannot unload specific classes. More specifically you cannot unload classes created in a ClassLoader that's not under your control.
If possible, I suggest using your own ClassLoader so you can unload.
Classes have an implicit strong reference to their ClassLoader instance, and vice versa. They are garbage collected as with Java objects. Without hitting the tools interface or similar, you can't remove individual classes.
As ever you can get memory leaks. Any strong reference to one of your classes or class loader will leak the whole thing. This occurs with the Sun implementations of ThreadLocal, java.sql.DriverManager and java.beans, for instance.
If you're live watching if unloading class worked in JConsole or something, try also adding java.lang.System.gc() at the end of your class unloading logic. It explicitly triggers Garbage Collector.