I'm trying to pass a serialized object from one process to another but it appears as though the classloaders have been changed from the default Java implementation and are conflicting.
How can I create an object from the one process using some standard Java classloader so that I can guarantee consistency?
I've never worked with classloaders in Java before so my understanding here is very limited.
The work of serialization is not done by classloaders. In fact, the only involvement that classloaders have are that it may be necessary to load a class when deserializing an object of that class ... if the class isn't already loaded.
It is hard to see how changing a classloader could cause problems. However, it is possible that you could have problems if a class loader was loading the an incompatible version of a class. That can cause a conflict. But the root cause is the different class versions, not the different class loaders.
How can you avoid the conflict? The simple answer is to not change classes when you have serialized instances. There are ways of dealing with class changes (using custom readObject / writeObject methods for example) but is better to avoid the problem.
This is one reason why Java object serialization is not a very good mechanism for persisting data.
Related
I have built in my application the ability to reload classes. However, I need to clarification on the behaviour of the class loader.
Let me explain what I know and then ask the questions...
What I do is provide a special jar that is loaded by a custom classloader. Then during the bootstrap of the jar a bunch of spring beans are created and the ones which implement certain interfaces get registered for use by the central app.
Now I kick off a process in the application which uses these new classes. I can successfully unload the "jar", change classes in the jar and reload and I get the changes. Unloading in class loader parlance means making the class loader that loaded the classes unreachable - this causes any classes loaded by that class loader to be unreachable and thus effectively unloaded.
However, from what I know so far, once the vm has loaded the class it stores it in some shared space so does not have to load it again.
The issue that I have is that sometimes the unload and reload of a new class does not work. The old class stays around. It stands to reason that if I unload a class loader (i.e. make the class loader unreachable) and one of the classes in that class loader is currently in use (an object of that type exists) then the class cannot be unloaded.
Is this true? It seems to be true in practice.
If that is so, how do I successfully unload classes that are in use at the time the class loader goes unreachable. Could I for example, put a weak reference on each class so I can detect when the class goes unreachable and act when the class goes unreachable? (not sure what action I could take though).
Update in response to #Kayaman
My use case is that I have the core application and then based on the customers requirements I can load different classes that implement known interfaces in the core application (so they can be accessed). Then the core application kicks off various processes which use these classes. The big strength of this is that I can update these plugin classes without doing a big redeploy and every customer does not need every one of these. The problem comes in when I want to load a new version of one of these and the current version is in use. Kinda stands to reason that this is not possible.
Conclusion
#Kayamam many thanks for your consult. It has been very helpful. This has codified my thinking somewhat. The conclusion is that no matter what technique I use you cannot ever, with the VM the way it is, reload a class for which there is currently a strongly reachable object. For some of my reloads I have control over these objects as I can make them unreachable before I unload and reload but for other classes I cannot do this... that is where my problem lies. What I need to do is to ring fence the objects of the classes that I wish to reload so that I can the process that is using them to pause while I reload the classes for those objects.
As you said, in order to unload a class, you need to get rid of the classloader. For example URLClassLoader can be used to load classes, and then null out the reference to make it eligible for GC and therefore unload the classes it loaded.
However, all classes know which classloader loaded them. That means that if you've got instances of your classes in use, they have a reference to the Class which has a reference to the ClassLoader and will prevent it from being collected and classes being unloaded. This is understandable, since having an object without a class would be quite an interesting situation.
So for a full reload you need to get rid of old instances and get rid of the classloader. That also avoids situations where you get mysterious exceptions about MyClass != MyClass.
WeakReference (or PhantomReference would probably be better here) would allow you to notice when your existing objects get collected, you just need to make sure you're tracking them all.
Spring adds to the complexity here, so I strongly recommend spending some time imagining that this approach is impossible, and to see if Spring has something that could be used to fulfil your business requirement. After all it does a lot of classloading itself, so you might be reinventing the wheel in a less clear way.
With quick Googling I found this http://docs.spring.io/spring-boot/docs/current/reference/html/howto-hotswapping.html which mentions among other things Spring Loaded which can apparently do class reloading and then some.
I have two different webapps, and each load the same class A with different classloader. When I put one instance in the session and then get it from the other webapp, a ClassCastException is thrown.
For example, in webapp A, I store a in the session, then in webapp B, I get the a from the session and cast it to A, the ClassCastException is thrown.
Is there a way to resolve this?
Is there a way to resolve this?
Basically no.
As far as the JLS is concerned, the types are different types, and there is no way that the JVM will allow you to pretend otherwise. For instance, the classes could have different code and different object layouts. If you could trick the JVM into treating the types as the same, you would be able to blow away JVM runtime safety. That way lies insanity.
The solution is to make sure that you don't have two different class loaders loading the same class. In the context of Tomcat, this means that if two or more webapps need to share instances of a class, then that class must be defined in a classloader that is common to both; e.g. put the JAR file in the $CATALINA_HOME/lib or $CATALINA_HOME/common directory.
If there is a technical reason why the classes have to be loaded by different classloaders (maybe because the classes really are different), then you could work around the problem by defining an interface that both versions of the class implement, and then programming to the interface rather than the implementation class. Of course, the interface needs to be loaded by a shared classloader ... or else you run into the same problem again.
You should avoid this situation, basically - either put both bits of functionality in the same webapp, or move the library containing class A into an appropriate location such that only one classloader will be used. Two classes loaded by different classloaders are entirely distinct in the JVM - you simply won't be able to cast between them.
See the Tomcat classloader documentation for more details about the various classloaders used. It looks like you'd want to put this common class into the common classloader area. As the documentation notes, this is pretty unusual, but if you really want to share an object between two webapps (which is also unusual) it's probably the easiest way forward.
You can't. Two classes loaded by different classloaders are different.
Perhaps you can serialize the shared objects?
I'm not necessarily advocating this approach, but then again that's what serialization essentially does --- you serialize from some JVM or ClassLoader X and load it in (deserialize) it into another JVM/ClassLoader Y...
You can't cast two object from different classes even if the classes have the same package name and signature, however you can copy the data from one to another simply by using apache bean utils library, BeanUtils.copyProperties(o1, o2);
This is possible with a workaround.
While it is true that you can't cast an object from one class loaded by classloader A to the same class loaded by classloader B (classes with the same name are not compatible if loaded under different classloaders as explained here), in a webapp container such as Jetty or Tomcat, you can load that class once in a parent classloader, which will be used by all webapps in the JVM. Each webapp classloader will defer to the shared (parent) classloader for the class definition and casting it back and forth works just fine.
For example, with Jetty, use WebAppContext.addSystemClass() as described here.
I have a custom class loader so that a desktop application can dynamically start loading classes from an AppServer I need to talk to. We did this since the amount of jars that are required to do this are ridiculous (if we wanted to ship them). We also have version problems if we don't load the classes dynamically at run time from the AppServer library.
Now, I just hit a problem where I need to talk to two different AppServers and found that depending on whose classes I load first I might break badly... Is there any way to force the unloading of the class without actually killing the JVM?
Hope this makes sense
The only way that a Class can be unloaded is if the Classloader used is garbage collected. This means, references to every single class and to the classloader itself need to go the way of the dodo.
One possible solution to your problem is to have a Classloader for every jar file, and a Classloader for each of the AppServers that delegates the actual loading of classes to specific Jar classloaders. That way, you can point to different versions of the jar file for every App server.
This is not trivial, though. The OSGi platform strives to do just this, as each bundle has a different classloader and dependencies are resolved by the platform. Maybe a good solution would be to take a look at it.
If you don't want to use OSGI, one possible implementation could be to use one instance of JarClassloader class for every JAR file.
And create a new, MultiClassloader class that extends Classloader. This class internally would have an array (or List) of JarClassloaders, and in the defineClass() method would iterate through all the internal classloaders until a definition can be found, or a NoClassDefFoundException is thrown. A couple of accessor methods can be provided to add new JarClassloaders to the class. There is several possible implementations on the net for a MultiClassLoader, so you might not even need to write your own.
If you instanciate a MultiClassloader for every connection to the server, in principle it is possible that every server uses a different version of the same class.
I've used the MultiClassloader idea in a project, where classes that contained user-defined scripts had to be loaded and unloaded from memory and it worked quite well.
Yes there are ways to load classes and to "unload" them later on. The trick is to implement your own classloader which resides between high level class loader (the System class loader) and the class loaders of the app server(s), and to hope that the app server's class loaders do delegate the classloading to the upper loaders.
A class is defined by its package, its name, and the class loader it originally loaded. Program a "proxy" classloader which is the first that is loaded when starting the JVM. Workflow:
The program starts and the real "main"-class is loaded by this proxy classloader.
Every class that then is normally loaded (i.e. not through another classloader implementation which could break the hierarchy) will be delegated to this class loader.
The proxy classloader delegates java.x and sun.x to the system classloader (these must not be loaded through any other classloader than the system classloader).
For every class that is replaceable, instantiate a classloader (which really loads the class and does not delegate it to the parent classloader) and load it through this.
Store the package/name of the classes as keys and the classloader as values in a data structure (i.e. Hashmap).
Every time the proxy classloader gets a request for a class that was loaded before, it returns the class from the class loader stored before.
It should be enough to locate the byte array of a class by your class loader (or to "delete" the key/value pair from your data structure) and reload the class in case you want to change it.
Done right there should not come a ClassCastException or LinkageError etc.
For more informations about class loader hierarchies (yes, that's exactly what you are implementing here ;- ) look at "Server-Based Java Programming" by Ted Neward - that book helped me implementing something very similar to what you want.
I wrote a custom classloader, from which it is possible to unload individual classes without GCing the classloader. Jar Class Loader
Classloaders can be a tricky problem. You can especially run into problems if you're using multiple classloaders and don't have their interactions clearly and rigorously defined. I think in order to actually be able to unload a class youlre going go have to remove all references to any classes(and their instances) you're trying to unload.
Most people needing to do this type of thing end up using OSGi. OSGi is really powerful and surprisingly lightweight and easy to use,
You can unload a ClassLoader but you cannot unload specific classes. More specifically you cannot unload classes created in a ClassLoader that's not under your control.
If possible, I suggest using your own ClassLoader so you can unload.
Classes have an implicit strong reference to their ClassLoader instance, and vice versa. They are garbage collected as with Java objects. Without hitting the tools interface or similar, you can't remove individual classes.
As ever you can get memory leaks. Any strong reference to one of your classes or class loader will leak the whole thing. This occurs with the Sun implementations of ThreadLocal, java.sql.DriverManager and java.beans, for instance.
If you're live watching if unloading class worked in JConsole or something, try also adding java.lang.System.gc() at the end of your class unloading logic. It explicitly triggers Garbage Collector.
I have one java program that has to be compiled as 1.4, and another program that could be anything (so, 1.4 or 1.6), and the two need to pass serialized objects back and forth. If I define a serializable class in a place where both programs can see it, will java's serialization still work, or do I need 1.6-1.6 or 1.4-1.4 only?
Make sure the classes to be serialized define and assign a value to static final long serialVersionUID and you should be ok.
That said, normally I would not do this. My preference is to only use normal serialization either within a single process, or between two processes are on the same machine and getting the serialized classes out of the same jar file. If that's not the case, serializing to XML is the better and safer choice.
Along with the serialVersionUID the package structure has to remain consistent for serialization, so if you had myjar.mypackage.myclass in 1.4 you have to have myjar.mypackage.myclass in 1.6.
It is not uncommon to have the Java version or your release version somewhere in the package structure. Even if the serialVersionUID remains the same between compilations the package structure will cause an incompatible version exception to get thrown at runtime.
BTW if you implement Serializable in your classes you should get a compiler warning if serialVersionUID is missing.
In my view (and based on some years of quite bitter experience) Java native serialization is fraught with problems and ought to be avoided if possible, especially as there is excellent XML/JSON support. If you do have to serialize natively, then I recommend that you hide your classes behind interfaces and implement a factory pattern in the background which will create an object of the right class when needed.
You can also use this abstraction for detecting the incompatible version exception and doing whatever conversion is necessary behind the scenes for migration of the data in your objects.
Java library classes should have compatible serialised forms between 1.4 and 1.6 unless otherwsie stated. Swing explicitly states that it is not compatible between versions, so if you are trying to serialise Swing objects then you are out of luck.
You may run into problems where the code generated by javac is slightly different. This will change the serialVersionUID. You should ensure you explicitly declare the UID in all your serialisable classes.
No, different version of the JVM will not break the serialization itself.
If some of the objects you are serializing are from the Java runtime, and their classes have evolved incompatibly, you will see failures. Most core Java classes are careful about this, but there have been discontinuities in some packages in the past.
I've successfully used serialization (in the context of RMI) with classes from different compilations on different machines running different versions of the Java runtime for years.
I don't want to digress too far from the original question, but I want to note that evolving a serialized class always requires care, regardless of the format. It is not an issue specific to Java serialization. You have to deal with the same concepts whether you serialize in XML, JSON, ASN.1, etc. Java serialization gives a fairly clear specification of what is allowed and how to make the changes that are allowed. Sometimes this is restrictive, other times it is helpful to have a prescription.
If both sides uses the same jar file, it will work most of the times. However if you use different versions of the same package/module/framework ( for instance different weblogic jars or extended usage of some "rare" exceptions ) a lot of integration test is needed before it can be approved.
I have a custom class loader so that a desktop application can dynamically start loading classes from an AppServer I need to talk to. We did this since the amount of jars that are required to do this are ridiculous (if we wanted to ship them). We also have version problems if we don't load the classes dynamically at run time from the AppServer library.
Now, I just hit a problem where I need to talk to two different AppServers and found that depending on whose classes I load first I might break badly... Is there any way to force the unloading of the class without actually killing the JVM?
Hope this makes sense
The only way that a Class can be unloaded is if the Classloader used is garbage collected. This means, references to every single class and to the classloader itself need to go the way of the dodo.
One possible solution to your problem is to have a Classloader for every jar file, and a Classloader for each of the AppServers that delegates the actual loading of classes to specific Jar classloaders. That way, you can point to different versions of the jar file for every App server.
This is not trivial, though. The OSGi platform strives to do just this, as each bundle has a different classloader and dependencies are resolved by the platform. Maybe a good solution would be to take a look at it.
If you don't want to use OSGI, one possible implementation could be to use one instance of JarClassloader class for every JAR file.
And create a new, MultiClassloader class that extends Classloader. This class internally would have an array (or List) of JarClassloaders, and in the defineClass() method would iterate through all the internal classloaders until a definition can be found, or a NoClassDefFoundException is thrown. A couple of accessor methods can be provided to add new JarClassloaders to the class. There is several possible implementations on the net for a MultiClassLoader, so you might not even need to write your own.
If you instanciate a MultiClassloader for every connection to the server, in principle it is possible that every server uses a different version of the same class.
I've used the MultiClassloader idea in a project, where classes that contained user-defined scripts had to be loaded and unloaded from memory and it worked quite well.
Yes there are ways to load classes and to "unload" them later on. The trick is to implement your own classloader which resides between high level class loader (the System class loader) and the class loaders of the app server(s), and to hope that the app server's class loaders do delegate the classloading to the upper loaders.
A class is defined by its package, its name, and the class loader it originally loaded. Program a "proxy" classloader which is the first that is loaded when starting the JVM. Workflow:
The program starts and the real "main"-class is loaded by this proxy classloader.
Every class that then is normally loaded (i.e. not through another classloader implementation which could break the hierarchy) will be delegated to this class loader.
The proxy classloader delegates java.x and sun.x to the system classloader (these must not be loaded through any other classloader than the system classloader).
For every class that is replaceable, instantiate a classloader (which really loads the class and does not delegate it to the parent classloader) and load it through this.
Store the package/name of the classes as keys and the classloader as values in a data structure (i.e. Hashmap).
Every time the proxy classloader gets a request for a class that was loaded before, it returns the class from the class loader stored before.
It should be enough to locate the byte array of a class by your class loader (or to "delete" the key/value pair from your data structure) and reload the class in case you want to change it.
Done right there should not come a ClassCastException or LinkageError etc.
For more informations about class loader hierarchies (yes, that's exactly what you are implementing here ;- ) look at "Server-Based Java Programming" by Ted Neward - that book helped me implementing something very similar to what you want.
I wrote a custom classloader, from which it is possible to unload individual classes without GCing the classloader. Jar Class Loader
Classloaders can be a tricky problem. You can especially run into problems if you're using multiple classloaders and don't have their interactions clearly and rigorously defined. I think in order to actually be able to unload a class youlre going go have to remove all references to any classes(and their instances) you're trying to unload.
Most people needing to do this type of thing end up using OSGi. OSGi is really powerful and surprisingly lightweight and easy to use,
You can unload a ClassLoader but you cannot unload specific classes. More specifically you cannot unload classes created in a ClassLoader that's not under your control.
If possible, I suggest using your own ClassLoader so you can unload.
Classes have an implicit strong reference to their ClassLoader instance, and vice versa. They are garbage collected as with Java objects. Without hitting the tools interface or similar, you can't remove individual classes.
As ever you can get memory leaks. Any strong reference to one of your classes or class loader will leak the whole thing. This occurs with the Sun implementations of ThreadLocal, java.sql.DriverManager and java.beans, for instance.
If you're live watching if unloading class worked in JConsole or something, try also adding java.lang.System.gc() at the end of your class unloading logic. It explicitly triggers Garbage Collector.