How to avoid classloading conflicts for a java library


We have a java library (more like a framework) which can be used to build JDBC drivers. The framework implements the JDBC interface, and you implement some interfaces which are consumed by the framework to realize the driver.
One issue which has come up is that if you create two drivers with this framework, and try to use them simultaneously in a single JVM, only a single set of the framework classes will be loaded.
This is bad for at least two reasons:
The framework contains singletons, so only the first driver loaded actually works correctly.
Even if we removed all singletons, things could break if the 'wrong version' of the framework is loaded at runtime for one of the drivers.
The solution used was to rename the framework class packages for each driver, so that they would no longer conflict. While this works, there is a concern about the bytecode manipulation, as it happens after our testing.
My question: Is there a better way? This needs to be handled from our side, not the application's, as an application uses the driver simply as a JDBC driver, so it needs to be transparent.
One idea I had was to serialize all the classes of the driver (other than the 'entrypoint' classes which implement Driver/DataSource) into some sort of resource which is packaged in the driver jar (with the FQN of the resource and entrypoint classes being unique to the driver), then have the entrypoint classes, in a static initialization block, read out the classes and load them with a private classloader. Are there any glaring problems I'm missing with this approach?
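A minimal sketch of the private-classloader idea, not a vetted implementation. It assumes the framework classes are still readable as ordinary .class resources through the parent loader (rather than packed into a custom serialized resource), and the names IsolatingClassLoader, FrameworkState and IsolationDemo are hypothetical. The loader defines matching classes itself instead of delegating to its parent, so every loader instance gets its own copy of the static state.

```java
import java.io.IOException;
import java.io.InputStream;
import java.lang.reflect.Method;

// Hypothetical stand-in for a framework class holding singleton-like static state.
final class FrameworkState {
    private static int counter = 0;

    static int increment() {
        return ++counter;
    }
}

// Defines every class whose name starts with `prefix` itself (child-first),
// reading the bytes through the parent; everything else is delegated as usual.
final class IsolatingClassLoader extends ClassLoader {
    private final String prefix;

    IsolatingClassLoader(ClassLoader parent, String prefix) {
        super(parent);
        this.prefix = prefix;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        if (!name.startsWith(prefix)) {
            return super.loadClass(name, resolve); // JDK and shared classes
        }
        synchronized (getClassLoadingLock(name)) {
            Class<?> c = findLoadedClass(name);
            if (c == null) {
                c = defineFromParentResource(name);
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }

    private Class<?> defineFromParentResource(String name) throws ClassNotFoundException {
        String path = name.replace('.', '/') + ".class";
        try (InputStream in = getParent().getResourceAsStream(path)) {
            if (in == null) {
                throw new ClassNotFoundException(name);
            }
            byte[] bytes = in.readAllBytes();
            return defineClass(name, bytes, 0, bytes.length);
        } catch (IOException e) {
            throw new ClassNotFoundException(name, e);
        }
    }
}

final class IsolationDemo {
    // Calls FrameworkState.increment() on the copy of the class owned by `loader`.
    static int incrementIn(ClassLoader loader) throws Exception {
        Class<?> state = loader.loadClass(FrameworkState.class.getName());
        Method m = state.getDeclaredMethod("increment");
        m.setAccessible(true); // the isolated copy lives in a different runtime package
        return (Integer) m.invoke(null);
    }
}
```

Two loaders created with the same prefix keep separate counters, which is the isolation the two drivers would need. One thing to watch with this approach: anything that crosses the boundary (for example java.sql interfaces) must still come from the parent, otherwise the entrypoint classes and the privately loaded classes will not agree on types.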

Related

Using different versions of dependencies in separated Java platform modules

I expected it to be possible to use, e.g., Guava 19 in myModuleA and Guava 20 in myModuleB, since Jigsaw modules have their own classpath.
Let's say myModuleA uses Iterators.emptyIterator(), which was removed in Guava 20, and myModuleB uses the new static method FluentIterable.of(), which wasn't available in Guava 19. Unfortunately, my test was negative. At compile time everything looks fine, but at runtime the result is a NoSuchMethodError. This means that whichever version of the class the classloader finds first decides which module fails.
Is this encapsulation with the old underlying coupling? I found a reason for myself: it couldn't be supported because transitive dependencies would have the same problem as before. If a Guava class with a version conflict appears in a signature in ModuleA, and ModuleB depends on it, which class should be used?
But why can we read all over the internet that "Jigsaw, the module system, stops the classpath hell"? We now have multiple smaller "similar-to-classpaths" with the same problems. This is more an uncertainty than a question.

Version Conflicts
First a correction: You say that modules have their own class path, which is not correct. The application's class path remains as it is. Parallel to it the module path was introduced but it essentially works in the same way. Particularly, all application classes are loaded by the same class loader (by default at least).
That there is only a single class loader for all application classes also explains why there can't be two versions of the same class: The entire class loading infrastructure is built on the assumption that a fully qualified class name suffices to identify a class within a class loader.
This also opens the path to the solution for multiple versions. As before, you can achieve that by using different class loaders. The module system's native way to do that is to create additional layers (each layer has its own loader).
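As a rough sketch of what creating such a layer can look like with the java.lang.ModuleLayer API: the class name LayerSketch and the method newLayer are hypothetical, and the module directory and root module names are whatever your build produces.

```java
import java.lang.module.Configuration;
import java.lang.module.ModuleFinder;
import java.nio.file.Path;
import java.util.Set;

final class LayerSketch {
    // Builds a child layer of the boot layer from the modules found in `moduleDir`.
    // All modules of the new layer share one class loader, separate from the
    // application class loader, so another layer can hold a different version
    // of the same module.
    static ModuleLayer newLayer(Path moduleDir, Set<String> rootModules) {
        ModuleFinder finder = ModuleFinder.of(moduleDir);
        ModuleLayer boot = ModuleLayer.boot();
        Configuration cf = boot.configuration()
                .resolve(finder, ModuleFinder.of(), rootModules);
        return boot.defineModulesWithOneLoader(cf, ClassLoader.getSystemClassLoader());
    }
}
```

Assuming two directories that each contain one Guava version plus the module that needs it, you would call newLayer once per directory and then obtain classes through layer.findLoader(moduleName).loadClass(className). Code in the two layers can only exchange objects through types that both layers see from a common parent.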
Module Hell?
So does the module system replace class path hell with module hell? Well, multiple versions of the same library are still not possible without creating new class loaders, so this fundamental problem remains.
On the other hand, now you at least get an error at compile or launch due to split packages. This prevents the program from subtly misbehaving, which is not that bad, either.

Theoretically it is possible to use different versions of the same library within your application. The concept that enables this: layering!
When you study Jigsaw under the hood you find a whole section dedicated to this topic.
The idea is basically that you can further group modules using these layers. Layers are constructed at runtime; and they have their own classloader. Meaning: it should be absolutely possible to use modules in different versions within one application - they just need to go into different layers. And as shown - this kind of "multiple version support" is actively discussed by the people working on java/jigsaw. It is not an obscure feature - it is meant to support different module versions under one hood.
The only disclaimer at this point: unfortunately there are no "complete" source code examples out there (of which I know), thus I can only link to that Oracle presentation.
In other words: there is some sort of solution to this versioning problem on the horizon, but it will take more time until we gain real-world experience with this new idea. And to be precise: you can have different layers that are isolated by different class loaders. There is no support that would allow "the same object" to use modV1 and modV2 at the same time. You can only have two objects, one using modV1 and the other modV2.
( German readers might want to have a look here - that publication contains another introduction to the topic of layers ).

Java 9 doesn't solve such problems. In a nutshell, what Java 9 did was extend the classic access modifiers (public, protected, package-private, private) to the jar level.
Prior to Java 9, if jar A depended on jar B, all public classes from B were visible to A.
With Java 9, visibility can be configured so that it is limited to a subset of classes: each module defines which packages it exports and which modules it requires.
Most of those checks are done by the compiler.
From a runtime perspective (classloader architecture) there is no big change: all application modules are loaded by the same classloader, so it's not possible to have different versions of the same class in the same JVM unless you use a modular framework like OSGi or manipulate classloaders yourself.

As others have hinted, JPMS layers can help with that. You can use them manually, but Layrry might be helpful to you: it is a fluent API and configuration-based launcher for running layered applications. It allows you to define the layer structure by means of configuration and it will fire up the layer graph for you. It also supports the dynamic addition/removal of layers at runtime.
Disclaimer: I'm the initial creator of Layrry

Why does getting JDBC Connection need Reflection? [duplicate]

What is the actual use of Class.forName("oracle.jdbc.driver.OracleDriver") while connecting to a database? Why can't we just import the class instead of loading it this way?

The basic idea behind using Class.forName() is to load a JDBC driver implementation. A (normal) JDBC driver must contain a static initializer that registers an instance of the driver implementation with java.sql.DriverManager:
JDBC drivers must implement the Driver interface, and the implementation must contain a static initializer that will be called when the driver is loaded. This initializer registers a new instance of itself with the DriverManager
(from JDBC 4.1, section 9.2)
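To make the quoted requirement concrete, here is a minimal sketch of a driver whose static initializer registers it. DemoDriver and the jdbc:demo: URL prefix are hypothetical, and the class deliberately opens no real connection.

```java
import java.sql.Connection;
import java.sql.Driver;
import java.sql.DriverManager;
import java.sql.DriverPropertyInfo;
import java.sql.SQLException;
import java.sql.SQLFeatureNotSupportedException;
import java.util.Properties;
import java.util.logging.Logger;

final class DemoDriver implements Driver {
    static final String URL_PREFIX = "jdbc:demo:"; // hypothetical URL scheme

    // Runs once, when the class is initialized (e.g. by Class.forName).
    static {
        try {
            DriverManager.registerDriver(new DemoDriver());
        } catch (SQLException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    @Override
    public boolean acceptsURL(String url) {
        return url != null && url.startsWith(URL_PREFIX);
    }

    @Override
    public Connection connect(String url, Properties info) throws SQLException {
        if (!acceptsURL(url)) {
            return null; // per the Driver contract: not our URL
        }
        throw new SQLFeatureNotSupportedException("sketch only, no real connection");
    }

    @Override
    public DriverPropertyInfo[] getPropertyInfo(String url, Properties info) {
        return new DriverPropertyInfo[0];
    }

    @Override
    public int getMajorVersion() {
        return 1;
    }

    @Override
    public int getMinorVersion() {
        return 0;
    }

    @Override
    public boolean jdbcCompliant() {
        return false;
    }

    @Override
    public Logger getParentLogger() throws SQLFeatureNotSupportedException {
        throw new SQLFeatureNotSupportedException("no java.util.logging support");
    }
}
```

After Class.forName has initialized DemoDriver, DriverManager.getDriver("jdbc:demo:anything") finds the registered instance without the calling code ever naming the class in an import.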
Since JDBC 4.0 however there is a new way to register drivers: the jar of a JDBC driver needs to include a file /META-INF/services/java.sql.Driver which contains the name(s) of the java.sql.Driver implementations in that jar. When you create a connection using the DriverManager, it will use java.util.ServiceLoader to enumerate all /META-INF/services/java.sql.Driver files in the classpath and load all drivers so they get registered.
The DriverManager.getConnection method has been enhanced to support the Java Standard Edition Service Provider mechanism. JDBC 4.0 Drivers must include the file META-INF/services/java.sql.Driver. This file contains the name of the JDBC driver’s implementation of java.sql.Driver.
(from JDBC 4.1, section 9.2.1)
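The discovery step can be observed directly with java.util.ServiceLoader. This is only an illustration of the mechanism (the helper class DriverDiscovery is hypothetical); DriverManager performs an equivalent lookup internally.

```java
import java.sql.Driver;
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

final class DriverDiscovery {
    // Lists the class names of all java.sql.Driver providers visible to the
    // thread context class loader. Iterating instantiates each provider.
    static List<String> providerNames() {
        List<String> names = new ArrayList<>();
        for (Driver driver : ServiceLoader.load(Driver.class)) {
            names.add(driver.getClass().getName());
        }
        return names;
    }
}
```

The result depends entirely on which driver jars are on the class path; with none present the list is empty.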
The reason drivers are loaded this way is that it allows you to decouple an application from the driver (and database) it uses. This means that you can write, compile and even distribute an application without any drivers; you only need to use the interfaces provided in the java.sql (and javax.sql) packages, which are part of Java, without needing to access the implementation directly.
The user of the application then adds a valid JDBC driver to the classpath (and configures things like a connection string) so the application can actually connect to a database. Before JDBC 4.0, the user would have to specify the driver name so that the application could load it using Class.forName; with a JDBC 4.0 compliant driver and Java 6 or higher this discovery is automatic.
When you load a driver literally with Class.forName("oracle.jdbc.driver.OracleDriver") it might feel like overkill, but if you keep in mind that it could also be a string pulled from a config file (or from user input) you might start to understand why it is so powerful.
Of course this driver independence is not 100%, especially not if your application uses vendor specific SQL. But the theory is that your application can be database independent. JDBC also provides some additional mechanisms to address this, eg JDBC escapes to provide a common syntax that the driver translates to the specific syntax, and DatabaseMetaData which allows you to discover features, reserved words etc which allow you to create or generate compatible queries.

A couple reasons to use Class.forName("") instead of just referencing the class directly:
Using Class.forName("") gives you more obvious control over where exactly the first attempt to load the specified class will be made in your code. This makes it more obvious where the code will fail (throw an exception) if that class is not present in the classpath when that code runs.
If you simply import the class and then reference it in your code, it becomes slightly less obvious where the code will throw an exception if the class is not present.
Also, using Class.forName("") is a way to get around potential compile-time restrictions. If, for example, the person compiling the code does not (for, let's say, licensing or intellectual property reasons) have access to the class oracle.jdbc.driver.OracleDriver, they may find it easier to compile code which references the class by Class.forName("") rather than directly.
If you do not need to use any methods, fields, or inner classes of the specified class, then Class.forName("") may be the clearest way to express that the only thing desired is to load the class (and have its static initializers run), and nothing else.
I don't think Class.forName exhibits any different functional behavior than referencing the class directly. It uses the calling class' classloader by default, which should be the same classloader that is used when referencing the class directly. There are some overloads to Class.forName("") that let you customize the class loading behavior a bit more.

It's a legacy way of doing it. If you import the class instead, you take on an extra compile-time dependency on the driver.
From The Java Tutorial:
In previous versions of JDBC, to obtain a connection, you first had to
initialize your JDBC driver by calling the method Class.forName. This
methods required an object of type java.sql.Driver. Each JDBC driver
contains one or more classes that implements the interface
java.sql.Driver.
...
Any JDBC 4.0 drivers that are found in your class
path are automatically loaded. (However, you must manually load any
drivers prior to JDBC 4.0 with the method Class.forName.)

Sometimes a class has to be loaded at run time, i.e. a class can be loaded into memory dynamically while the Java application is executing. Class.forName is used to load a given class (named by a String) at run time. For example, an IDE's GUI builder lets us drag and drop buttons, text fields, etc.; this drag-and-drop mechanism internally requires certain classes to be loaded at run time.
In Class.forName("sun.jdbc.odbc.JdbcOdbcDriver"), Class is the class java.lang.Class and forName() is a static method of it. The JDBC driver class named by the String is loaded dynamically at run time; the driver class contains a static block which creates a driver object and registers it with the DriverManager automatically. Since forName() is static, we call it using the class name (Class).

When we want to execute the static block of a class without creating an object of it, we can use Class.forName(). Most of the work that a Driver class does is in its static block.
What we require for JDBC connectivity is to get the driver registered with DriverManager and to obtain connections from it. This can be achieved simply by getting the static block executed; there is no need to create an object of that class ourselves, which also avoids an unnecessary instantiation.

Consistent OSGi import of 3rd party libraries

I've been developing OSGi modules but so far I've come across a number of issues when I've had to wrap existing jars. An example of this is the Oracle database driver which, even though I've wrapped the jar as a bundle, just refuses to work (it cannot find the driver class even though it's present). This is just a single example, but I've had issues with other 3rd party libraries and was wondering if there's a best practice approach to using 3rd party libraries which works every time?

The problem in your case is that JDBC uses a class from the Java runtime to find the database driver (DriverManager.getConnection). This cannot work because the database driver is not accessible from the system classloader (the one that loaded the DriverManager class).
A way that works in OSGi is to use a DataSource instead: http://docs.oracle.com/javase/tutorial/jdbc/basics/sqldatasources.html . There you simply create the data source using new and this of course works. The problem is that it makes your user bundle depend on the specific DB driver. So the best practice is to create the DataSource centrally and publish it as service.
You can find some more details in my Apache Karaf DB Tutorial (http://www.liquid-reality.de/display/liquid/2012/01/13/Apache+Karaf+Tutorial+Part+6+-+Database+Access).
By the way, in general this kind of factory is typically where libraries fail in OSGi. Every library invents its own factory system, and most of them are incompatible with the restricted classloaders of OSGi. Luckily most libraries are OSGi-ready nowadays. Most of the time this simply means that you can also call the factory with a concrete object that you can retrieve using an OSGi service.

My preferred approach is not to wrap the library, but to unjar it, add a manifest, and re-jar it. Jars-inside-jars tend to cause issues that are hard to debug. Unjar and re-jar can be automated with a simple ant script.
Also, I like to write MANIFEST.MF manually. If the library being wrapped is small, then it's easy enough to do that. Tools like bnd that generate MANIFEST.MF for you do not always give the right results, and if you rely on them too much you don't know what is going on under the hood.

How to implement optional support for a jar if ONLY it exists

I'm developing a jar library for utility use.
I want to significantly reduce the dependencies on external jars/libraries, but provide native support for such a library if it exists in the class path.
For instance, I want the library to provide routines that work with hibernate; however I don't want the application to die with an error if the hibernate jars are not present.
Is there a way to implement this, without having to bundle the hibernate jars with my library?

During your initialization, you could use Class.forName to look up one of the Hibernate classes, and if it throws a ClassNotFoundException catch it and you know Hibernate isn't in the environment — set a flag so that you know not to do Hibernate-specific things.
Note, though, that if any of your classes refers to Hibernate classes statically (e.g., via import), those classes won't load if Hibernate isn't in the class path. The usual way to deal with that is:
Create an interface to your Hibernate-specific stuff, e.g. HibernateStuff.
Only refer to the interface from your main code, and don't refer to any Hibernate classes in your main code via import.
Have your Hibernate-specific stuff in a class implementing that interface, e.g., HibernateStuffImpl. That class can import Hibernate stuff.
Once you've determined that Hibernate is in the classpath (via Class.forName), use Class.forName to load your HibernateStuffImpl and then use Class#newInstance to create an instance of it, assigning it to a variable of type HibernateStuff.
Use that variable to call into your Hibernate-specific stuff.
You might have a HibernateStuffStub class that implements the interface by doing nothing, and use that when Hibernate isn't loaded, so your code isn't peppered with conditional statements. (The JVM is very quick at no-op calls.)
...and of course, all of the above applies to anything, not just Hibernate.
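The steps above can be sketched roughly as follows. All names are hypothetical, HibernateStuffImpl is a placeholder that would import Hibernate types in a real library, and the sketch uses getDeclaredConstructor().newInstance() because Class#newInstance has been deprecated since Java 9.

```java
// The only type the main code refers to; it mentions no Hibernate classes.
interface HibernateStuff {
    String describe();
}

// No-op fallback used when the optional library is absent.
final class HibernateStuffStub implements HibernateStuff {
    @Override
    public String describe() {
        return "stub";
    }
}

// Placeholder: in a real library this class would import Hibernate types,
// which is why it is only ever loaded by name.
final class HibernateStuffImpl implements HibernateStuff {
    @Override
    public String describe() {
        return "impl";
    }
}

final class OptionalSupport {
    // probeClass: a class that exists only if the optional library is present.
    // implClass:  the implementation to instantiate when the probe succeeds.
    static HibernateStuff load(String probeClass, String implClass) {
        try {
            // initialize=false: we only want to know whether the class can be found
            Class.forName(probeClass, false, OptionalSupport.class.getClassLoader());
            return (HibernateStuff) Class.forName(implClass)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException | LinkageError e) {
            return new HibernateStuffStub();
        }
    }
}
```

A caller would pass a class name from the optional library as the probe (for Hibernate, something like "org.hibernate.Session") together with the fully qualified name of its own implementation class, and then work only with the returned HibernateStuff.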

Why java need Class.forName or dynamic loading?

Say a JDBC driver needs Class.forName to execute the static block of a class.
Why not just declare it as a class field?

Class.forName() is guaranteed to initialize the class at the time you call it. How would you propose to do it? Could you just declare a local variable without assigning it, like com.foo.Driver d;? What about making it a member variable instead? Would you have to actually assign it? What does the spec say about how and when a class has to be loaded? Do you really want to have to think about that, or just call Class.forName()?
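A small experiment answers some of these questions. The classes InitLog, LazyDriver and InitDemo are hypothetical; the static block of LazyDriver stands in for a driver's registration code.

```java
import java.util.ArrayList;
import java.util.List;

// Records which static initializers have run.
final class InitLog {
    static final List<String> EVENTS = new ArrayList<>();
}

// Stand-in for a driver class whose static block does the registration.
final class LazyDriver {
    static {
        InitLog.EVENTS.add("LazyDriver initialized");
    }
}

final class InitDemo {
    // Returns the number of recorded initializations after each step.
    static List<Integer> run() throws ClassNotFoundException {
        List<Integer> counts = new ArrayList<>();

        LazyDriver unused = null;            // declaring a variable: no initialization
        Class<?> literal = LazyDriver.class; // class literal: no initialization either
        counts.add(InitLog.EVENTS.size());

        // Three-argument overload with initialize=false: still not initialized.
        Class.forName(literal.getName(), false, InitDemo.class.getClassLoader());
        counts.add(InitLog.EVENTS.size());

        // One-argument overload: initializes the class, so the static block runs.
        Class.forName(literal.getName());
        counts.add(InitLog.EVENTS.size());

        return counts;
    }
}
```

On a fresh JVM, run() returns [0, 0, 1]: neither the declaration nor the class literal triggers the static block, while the one-argument Class.forName does.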
On a related note, it's no longer necessary to do this with many JDBC drivers. The DriverManager now uses the ServiceLoader mechanism to identify and load conforming driver classes.

The whole idea of JDBC is to not be dependent on one specific driver or implementation. The idea is that you can use JDBC and configure at runtime any driver which is available. To do this you need to load the driver by name and use the JDBC methods. Unfortunately JDBC doesn't abstract away all the differences between databases, such as error codes, and switching to a database you haven't tested may not be a good idea.
You could take the view that all of your libraries are available at compile time and that you wouldn't change the database on a whim without at least re-testing and re-deploying your application. In this case linking to a specific driver (instead of using Class.forName) might be a good thing, because it would force you (or whoever does this) to put more thought into the change and follow your testing procedures.

It's impractical to use any technique other than reflection for loading JDBC drivers
(though there are different ways to do it). There are a lot of JDBC drivers, and the implementation code may not be available to the app.
