I'm working on a web application where I have dependencies on two different jars containing two different versions of the same class. The jar files are supplied by an external vendor and cannot be changed.
I've created a custom class loader, which first tries to load classes from a specific set of jars and, if that fails, loads the class in the standard manner. This ensures that a specific set of jar files is always used first, which solves my problem.
However, I was wondering if there was an easier way.
Other than rearchitecting your app for OSGi, I'd say that's the best solution.
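For reference, the loader described in the question could look roughly like the sketch below. It is only an illustration under assumptions: the class name PreferredJarsClassLoader is made up, and the preferred jars are assumed to be available as URLs.

```java
import java.net.URL;
import java.net.URLClassLoader;

/** Hypothetical sketch: checks a fixed set of jars first, then falls back to normal delegation. */
class PreferredJarsClassLoader extends URLClassLoader {

    PreferredJarsClassLoader(URL[] preferredJars, ClassLoader parent) {
        super(preferredJars, parent);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            Class<?> c = findLoadedClass(name);
            if (c == null) {
                try {
                    // Look in the preferred jars before asking the parent.
                    c = findClass(name);
                } catch (ClassNotFoundException notInPreferredJars) {
                    // Not there: load the class in the standard (parent-first) manner.
                    c = super.loadClass(name, false);
                }
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }
}
```

Classes that are not in the preferred jars (including everything in java.*) still come from the parent, so only the conflicting classes are affected.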
Is there a way to automatically find out which Java classes are actually loaded (either during compile time, as far as that's possible, or during the runtime of an application), and to throw out all other classes from a JAR to create a smaller JAR? Does that actually make sense in practice?
I am talking about the application classes for an application JAR. Usually there are lots of libraries in an application, and an application rarely needs all features of those libraries, so I suspect this would make for a considerably smaller application. In theory it might be done, for example, via a Java agent that logs which classes and resources are read by one or several runs of an application (or even just by java -verbose:class), and a Maven plugin that throws out all other classes from a jar-with-dependencies. Is there already something like that?
Clarification: I am not talking about unused dependencies (JARs that are not used at all), but about removing unused parts of each included JAR.
Well, the Maven Shade Plugin has an option minimizeJar when creating an Uber-JAR for your application:
https://maven.apache.org/plugins/maven-shade-plugin/
But, as others already pointed out, this is quite dangerous, as it regularly fails to detect class accesses which are done via Reflection or other dynamic references.
It may not be a good idea to automate this, as the application can use reflection to initialise objects, or one JAR may depend on another JAR.
The only way I can think of is to remove the JARs one by one and check whether the application still runs as expected. Even then, all modules of the application have to be tested, since one module may work without a particular dependency while another may not.
A better solution is to take care while developing. The application developer must be careful when adding a dependency and must remove unwanted dependencies once his/her piece of code is done.
Global strategy.
1) Find all the classes that are loaded during runtime.
2) List of all the classes available in the classpath.
3) Reduce your class path by creating copies of jars containing only classes you need.
I have done parts 1 and 2 myself, so I can help you with those.
1) Find out all the classes that are loaded. You need 100 % code coverage (I am not talking about tests, but production). So run all possible scenarios, so all the classes your app needs will be loaded and logged.
To log loaded classes, there are several approaches to try: reflection, the -verbose:class flag, or a Java agent, which lets you modify methods at runtime. This is an example of some java agent code or another java agent example
2) To find all the classes available in a jar, you can write a program. You need to know every place where the application's jars are located. Loop through these jars (you can use ZipFile), loop through the ZipEntry entries of each one, and collect all the classes.
3) After that write a script or program that reassembles your application. For example, now you can create a new jar file for each library and put there only needed classes.
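Step 2 can be sketched as follows. This is a minimal illustration, not the code I actually used: the class name JarClassLister and both method names are made up, and the set of loaded class names from step 1 is assumed to have been collected already (for example from -verbose:class output).

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.Enumeration;
import java.util.Set;
import java.util.TreeSet;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

/** Hypothetical helper for step 2: list the classes a jar contains. */
class JarClassLister {

    /** Returns the fully qualified names of all .class entries in the given jar. */
    static Set<String> classNames(Path jar) throws IOException {
        Set<String> names = new TreeSet<>();
        try (ZipFile zip = new ZipFile(jar.toFile())) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                String entry = entries.nextElement().getName();
                if (entry.endsWith(".class")) {
                    // "com/acme/Foo.class" -> "com.acme.Foo"
                    String withoutSuffix = entry.substring(0, entry.length() - ".class".length());
                    names.add(withoutSuffix.replace('/', '.'));
                }
            }
        }
        return names;
    }

    /** Classes that are available on the classpath but were never seen being loaded. */
    static Set<String> unused(Set<String> available, Set<String> loaded) {
        Set<String> result = new TreeSet<>(available);
        result.removeAll(loaded);
        return result;
    }
}
```

The result of unused(...) is only a list of candidates for removal. Classes reached through reflection in scenarios you did not run will show up there too.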
You may also use a tool (again, you are a programmer, so write a program) that checks the code for class dependencies. You do not want to remove classes that are needed for compilation. When I was a student, I wrote a code analyzer that builds a directed graph of class dependencies.
As @Gokul Nath KP notes, I did this before. I manually changed the Gradle and Maven dependencies, removing them one by one, and then ran a full regression test. It took me a week (our application was small compared to modern enterprise systems created by hundreds of developers).
So, be creative, and in case of success, your project will be used by millions!
I have a use case where I need to dynamically load and share a predefined set of packages/classes between multiple classloaders. The goal is to increase overall performance every time we load a jar file.
We are trying to load and execute "apps". Each app contains our SDK libraries. Each app is loaded in its own URLClassLoader to keep it isolated from other code and to prevent library version conflicts. Since our SDK libraries are in each and every app, we would like to cache these specific sets of packages/classes so that the next time we load another app, the SDK classes do not need to be loaded again, removing this overhead.
Since we have several versions of our SDK out there, we are attempting to do this dynamically. Such that, when an app is loaded, we already know what version of the SDK it is. So we would like to have some implementation of a parent Classloader that is specific to this sdk version (we will call it the SDKCacheClassLoader). When the URLClassLoader (or some custom subclass of it) loads and recognizes a package/class from the sdk, we would like to forward that class to the shared parent SDKCacheClassLoader. Then, the next time we load another jar file with the URLClassLoader (setting SDKCacheClassLoader as the shared parent classloader), the parent should already have that class found and loaded from the last execution and the new jar file should not need to be scanned looking for the SDK classes again.
Anyone have any ideas on how to accomplish this goal?
Also, to be clear, I do not need to share instantiated objects, just simply the class definition since I understand that the JVM considers the same classes loaded by 2 different class loaders as completely separate classes.
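To make the idea concrete, here is a rough sketch of the arrangement I have in mind. It assumes the SDK jars for a given version can be located separately from the app's own jars, and the names SdkLoaderCache, sdkLoader and appLoader are made up for illustration.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch: one shared parent loader per SDK version, one child loader per app. */
class SdkLoaderCache {

    // SDK version -> shared loader holding that version's SDK classes.
    private final Map<String, ClassLoader> sdkLoaders = new ConcurrentHashMap<>();

    /** Returns the shared loader for this SDK version, creating it on first use. */
    ClassLoader sdkLoader(String sdkVersion, URL[] sdkJars) {
        return sdkLoaders.computeIfAbsent(sdkVersion,
                version -> new URLClassLoader(sdkJars, ClassLoader.getSystemClassLoader()));
    }

    /** Creates an isolated loader for one app whose parent is the shared SDK loader. */
    URLClassLoader appLoader(String sdkVersion, URL[] sdkJars, URL[] appJars) {
        return new URLClassLoader(appJars, sdkLoader(sdkVersion, sdkJars));
    }
}
```

With the default parent-first delegation, an SDK class requested by any app would be defined once by the shared parent and reused by later apps of the same SDK version. One limitation of this sketch is that classes defined by the parent cannot see classes that exist only in an app's jar.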
I need to redefine a single class file in a spring-boot application from an external library io.external.library until they update their code. I have this working by creating a single file with the same package name and class name in my own project.
src/main/java/io/external/library/TheirFile.java
I am a bit worried that this just happens to work and in different conditions the original file might be loaded into memory.
I have seen that there may be ways to achieve this in Gradle by unzipping and repackaging the jar file. This seems to be a bit of an overkill to me though considering this is a temp solution until the third party updates their code.
What is the simplest or best practice way to achieve this?
It is a spring-boot application and my build system is Gradle.
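One way to at least check which copy of the class wins at runtime is to print where it was loaded from. A minimal sketch, where ClassOrigin and locationOf are made-up names:

```java
import java.security.CodeSource;

/** Hypothetical diagnostic: report where a class was loaded from. */
class ClassOrigin {

    /** Returns the code source location of the class, or a marker if none is recorded. */
    static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            // Typically the case for classes loaded by the bootstrap loader.
            return "<bootstrap or unknown>";
        }
        return source.getLocation().toString();
    }
}
```

Logging ClassOrigin.locationOf(TheirFile.class) at startup would show whether the class came from my own project's classes or from the vendor's jar. This only verifies the outcome in a given environment; it does not by itself guarantee the ordering.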
I see many Java packages have api, impl and bundle jars (name-api.jar, name-impl.jar, name-bundle.jar). Could someone explain what those mean? Are all three needed by the app?
The idea is to separate the dependencies of the application, in an attempt to make applications more portable. You make the application depend on the api.jar when compiling. Then, when you want to run the program, you can switch in the appropriate implementation jar (impl.jar) and the appropriate resource bundle jar (bundle.jar).
As an example suppose the library does some database interaction. You write your code so that it references the api.jar. Now suppose you need it to work with a specific type of database e.g. MySQL - you would then add the impl.jar that is specific to MySQL databases to the classpath to get it to work (if you need a different database later - you only need to switch that jar in the classpath).
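A small sketch of that split, with everything in one file for brevity. All names here (DataStore, MySqlDataStore, DataStoreLoader) are hypothetical; in a real library the interface would sit in the api jar and the implementation in the impl jar.

```java
// --- would live in name-api.jar: the contract the application compiles against ---
interface DataStore {
    String describe();
}

// --- would live in a MySQL-specific name-impl.jar ---
class MySqlDataStore implements DataStore {
    @Override
    public String describe() {
        return "mysql";
    }
}

// --- application code: refers only to the API type; the implementation is chosen by name ---
class DataStoreLoader {

    /** Loads whichever implementation class is named, as long as it is on the classpath. */
    static DataStore load(String implClassName) throws ReflectiveOperationException {
        Class<? extends DataStore> impl =
                Class.forName(implClassName).asSubclass(DataStore.class);
        return impl.getDeclaredConstructor().newInstance();
    }
}
```

Because the application never names MySqlDataStore in its own source, swapping the impl jar on the classpath (and the class name passed to load) is enough to change databases without recompiling.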
The bundle.jar is a bit more obscure and not as common. This could be used to supply configuration setting for the library. For example it could be used to supply language specific settings, or some more specific config. In the case of the database library it might be that the implementation is designed for all versions of MySQL, and the resource bundle jar provides config files that allow it to work for a specific MySQL version.
Often:
name-api.jar contains only the interface of the API.
name-impl.jar provides an implementation of all interfaces in the name-api.jar
name-bundle.jar bundles everything with all the needed classes to run a Java application.
api.jar contains API interfaces. These are interfaces as a contract that the implementation of the API should follow.
impl.jar is the implementation of the api.jar. You can't just have the impl.jar without the api.jar.
bundle.jar is the resources (if I'm not mistaken). Those are resources needed for the implementation code necessary to run.
I've never seen such an arrangement.
If the designer packaged the app into three JARs, then I'd say all three are needed.
But you should recognize that it's just a choice made by the designer. It's possible that s/he could have just created a single JAR with everything in it and you'd be none the wiser.
I'm guessing now, but if you were to open those JARs you'd see only interfaces in the API JAR, implementations of those interfaces in the impl JAR, and resource bundles and other .properties files in the bundle JAR. Try it and see. You'll learn something.
I have a library that writes data in either a text or binary format. It has the following three components:
common data structures
text writer (depends on 1)
binary writer (depends on 1)
The obvious way to distribute this is as 3 .jar files, so that users can include only what they need.
However, the "common data structures" component is really just two small classes so I'm considering creating only two .jar files and including the common .class files in both.
My question: What are the potential problems with doing this?
The potential version mismatch others mentioned is actually one case of a larger set of classloading problems you might face if you deploy the same class(es) in different jars.
Classloading bugs are most likely to bite you in an application server / EJB container or similar setup, where there are multiple components / apps loaded by a hierarchy of classloaders. However, if the same class is loaded by two different classloaders, these are seen as totally distinct classes by the JVM! Which may result in different runtime errors like LinkageError (e.g. if two different versions of the same class definition collide - as described in other answers), ClassCastException (if a cast is attempted between two class definitions loaded by different classloaders) etc. Believe me, classloading hell is a place you don't want to see.
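The "distinct classes" point can be reproduced in a few lines. The sketch below is an illustration only: IsolatingLoader, Payload and LoaderIdentityDemo are made-up names, and it assumes the class file of Payload is readable as a resource through the parent loader (true for an ordinary classpath).

```java
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical loader that defines one named class itself instead of delegating to its parent. */
class IsolatingLoader extends ClassLoader {
    private final String isolatedName;

    IsolatingLoader(String isolatedName, ClassLoader parent) {
        super(parent);
        this.isolatedName = isolatedName;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        if (!name.equals(isolatedName)) {
            return super.loadClass(name, resolve); // everything else: normal delegation
        }
        synchronized (getClassLoadingLock(name)) {
            Class<?> c = findLoadedClass(name);
            if (c == null) {
                String resource = name.replace('.', '/') + ".class";
                try (InputStream in = getParent().getResourceAsStream(resource)) {
                    if (in == null) {
                        throw new ClassNotFoundException(name);
                    }
                    byte[] bytes = in.readAllBytes();
                    c = defineClass(name, bytes, 0, bytes.length); // same bytes, new class identity
                } catch (IOException e) {
                    throw new ClassNotFoundException(name, e);
                }
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }
}

/** Stand-in for a class that ships in two different jars. */
class Payload { }

class LoaderIdentityDemo {
    /** True if two loaders produce different Class objects for the same class name. */
    static boolean distinctAcrossLoaders() throws ClassNotFoundException {
        String name = Payload.class.getName();
        ClassLoader parent = Payload.class.getClassLoader();
        Class<?> first = new IsolatingLoader(name, parent).loadClass(name);
        Class<?> second = new IsolatingLoader(name, parent).loadClass(name);
        return first != second && first != Payload.class
                && !first.isInstance(new Payload()); // a cast between them would fail
    }
}
```

The two Class objects have the same name and the same bytes, yet an instance of one is not an instance of the other, which is exactly the situation that leads to a ClassCastException.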
I would put the whole library into a single jar to minimize that risk.
If you release a new version of your library, a user who uses both libraries (one old, one new) could get runtime exceptions when the new library picks up a class of component 1 from the old library (or vice versa, if it is not backwards compatible). The easiest option would be to release everything in one jar, so you would not get this version issue.
A potential problem is that in the future the data structure versions in these two jar files could become inconsistent (say, because of a bug fix or a minor release). In that case you could get a NoClassDefFoundError or a similar linkage error if you need to include both jars in an application. I would recommend either dividing it into three jar files or shipping just one bigger one.
I recommend using JarJar which is a library for packaging a number of jar files in a single jar file.
There is an Ant task for integrating into your build and your build environment can therefore just keep the raw jars and you can just have a simple deployment (but remember to include the license.txt files from the various libraries with your distribution).