I'm struggling with how to approach jar dependency hell. I have a Maven/IntelliJ Scala project that uses some AWS SDKs. Recently, adding the Kinesis SDK introduced incompatible versions of Jackson.
My question is: how do I systematically approach the problem of jar hell?
I understand class loaders and how Maven chooses between duplicate jars, but I am still at a loss regarding actual practical steps to fix the issue.
My attempts at the moment are based on trial and error, and I am outlining them here with the Jackson example:
First, I look at what the actual exception is, in this case a NoSuchMethodError on the Jackson data-binding ObjectMapper class. I then look at the Jackson docs to see when the method was added or removed. This is usually quite tedious, as I manually check the API docs for each version (question 1: is there a better way?).
Then, I use mvn dependency:tree to figure out which version of Jackson I am actually using (question 2: is there an automatic way of asking Maven which version of a jar is in use, rather than combing through the tree output?).
Next, I compare the mvn dependency:tree output from before adding the Kinesis SDK with the output from after, to detect differences and hopefully see whether the Jackson version changed. (Question 3: How does Maven treat the libraries inside shaded jars during dependency resolution? The same as any other?)
Finally, after comparing the tree outputs, I add the latest working version of Jackson explicitly to the POM, to take precedence in Maven's dependency resolution. If the latest does not work, I try the next most recent version, and so forth.
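That last step can look like the following in the POM (the version is a placeholder; use the newest one all your SDKs accept):

```xml
<!-- Declaring Jackson directly makes this version win Maven's
     "nearest wins" mediation over any transitive version.
     2.6.7 is illustrative, not a recommendation. -->
<dependency>
    <groupId>com.fasterxml.jackson.core</groupId>
    <artifactId>jackson-databind</artifactId>
    <version>2.6.7</version>
</dependency>
```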
This entire procedure is incredibly tedious. Besides the specific questions I asked, I am also curious about other people's systematic approaches to this problem. Does anyone have any resources that they use?
I then look at the Jackson docs to see when the method was added or removed. This is usually quite tedious, as I manually check the api docs for each version (question 1: is there a better way?)
To check API (breaking) compatibility, there are several tools that can automatically analyze jars and give you the right information. This Stack Overflow post has nice hints about some handy tools.
JAPICC seems quite good.
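As a sketch of the invocation (the jar file names are hypothetical; point it at the two versions you are comparing), JAPICC diffs two versions of a library and reports added/removed methods:

```
japi-compliance-checker jackson-databind-2.5.0.jar jackson-databind-2.6.0.jar
```

It produces an HTML report listing binary- and source-incompatible changes, which is much faster than combing the API docs version by version.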
Then, I use mvn dependency:tree to figure out which version of Jackson I am actually using (question 2: is there an automatic way of asking Maven which version of a jar is in use, rather than combing through the tree output?)
mvn dependency:tree is definitely the way to go, but you can filter its output from the start and only get what you are actually looking for, using its includes option as follows:
mvn dependency:tree -Dincludes=<groupId>
Note: you can also provide further info to the includes option in the form groupId:artifactId:type:version, or use wildcards like *:artifactId.
It may seem a small hint, but in large projects with many dependencies, narrowing down the output is a great help. Normally the groupId alone should be enough as a filter; *:artifactId is probably the fastest, though, if you are looking for a specific dependency.
If you are interested in a list of dependencies (not as a tree) also alphabetically ordered (quite handy in many scenarios), then the following may also help:
mvn dependency:list -Dsort=true -DincludeGroupIds=groupId
question 3: How does maven use the libraries in shaded jars, when dependency resolution occurs? Same as any other?
By shaded jars you may mean:
Fat jars, which bundle other jars' content. In this case they are seen as one dependency, one unit for Maven dependency mediation, and their content becomes part of the project classpath. In general, you shouldn't have fat jars among your dependencies, since you have no control over the packed libraries they bring in.
Jars with shaded (renamed) packages. In this case, again, there is no control as far as Maven dependency mediation is concerned: it's one unit, one jar, identified by its GAVC (GroupId, ArtifactId, Version, Classifier), which makes it unique. Its content is then added to the project classpath (according to the dependency scope), but since its packages were renamed, you may get conflicts that are difficult to handle. Again, you shouldn't have jars with renamed packages among your project dependencies (but often you can't know that in advance).
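For context, package renaming is what the maven-shade-plugin's relocation feature does when such a jar is built; a minimal sketch (the pattern names are illustrative):

```xml
<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-shade-plugin</artifactId>
    <version>2.4.3</version>
    <executions>
        <execution>
            <phase>package</phase>
            <goals><goal>shade</goal></goals>
            <configuration>
                <relocations>
                    <relocation>
                        <!-- Jackson classes end up under a private package name,
                             invisible to Maven's dependency mediation. -->
                        <pattern>com.fasterxml.jackson</pattern>
                        <shadedPattern>com.example.shaded.jackson</shadedPattern>
                    </relocation>
                </relocations>
            </configuration>
        </execution>
    </executions>
</plugin>
```

This is why a shaded dependency can silently carry its own Jackson: the relocated copy never conflicts at resolution time, only (potentially) at runtime.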
Does any one have any resources that they use?
In general, you should understand well how Maven handles dependencies and use the resources it offers (its tools and mechanisms). Below are some important points:
dependencyManagement is definitely the entry point on this topic: here you can deal with Maven dependency mediation and influence its decisions on transitive dependencies, their versions, and their scope. One important point: what you add to dependencyManagement is not automatically added as a dependency. It is only taken into account once a dependency of the project (declared in the pom.xml file or arriving transitively) matches one of its entries; otherwise it is simply ignored. It's an important part of the pom.xml, since it helps govern dependencies and their transitive graphs, and that's why it is often used in parent POMs: if you want to manage in one centralized place which version of, e.g., log4j to use across all of your Maven projects, you declare it in the dependencyManagement of a common/shared parent POM and make sure it is used as such. Centralization means better governance and better maintenance.
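A minimal sketch, using the Jackson example from the question (the version is a placeholder):

```xml
<dependencyManagement>
    <dependencies>
        <!-- Forces this version on any jackson-databind arriving transitively;
             ignored entirely if nothing in the graph pulls it in. -->
        <dependency>
            <groupId>com.fasterxml.jackson.core</groupId>
            <artifactId>jackson-databind</artifactId>
            <version>2.6.7</version>
        </dependency>
    </dependencies>
</dependencyManagement>
```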
The dependencies section is important for declaring dependencies: normally, you should declare here only the direct dependencies you need. A good rule of thumb is: declare here in compile (the default) scope only what you actually use in import statements in your code (though you sometimes need to go beyond that, e.g., a JDBC driver required at runtime and never referenced in your code, which would then be in runtime scope). Also remember that the order of declaration matters: the first declared dependency wins in case of conflict with a transitive dependency, hence by explicitly re-declaring a dependency you can effectively influence dependency mediation.
Don't abuse exclusions in dependencies to handle transitive dependencies: use dependencyManagement and the order of dependencies for that, if you can. Overusing exclusions makes maintenance much more difficult; use them only if you really need to. Also, when adding an exclusion, always add an XML comment explaining why: your teammates and/or your future self will appreciate it.
Use dependency scopes thoughtfully. Use the default (compile) scope for what you really need for compilation and testing (e.g., log4j), use test only for what is used exclusively under test (e.g., junit), mind the provided scope for what is already provided by your target container (e.g., servlet-api), and use the runtime scope for what you need at runtime but should never compile against (e.g., JDBC drivers). Don't use the system scope, since it only brings trouble (e.g., it is not packaged with your final artifact).
Don't play with version ranges unless you have specific reasons, and be aware that the version specified is a minimum requirement by default; the [<version>] expression is the strictest one, but you will rarely need it.
Use a Maven property as a placeholder for the version element of families of libraries, so that you have one centralized place for versioning a set of dependencies that should all share the same version value. A classic example would be a spring.version or hibernate.version property used for several dependencies. Again, centralization means better governance and maintenance, which also means less headache and less hell.
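As a sketch with the Jackson family from the question (coordinates are real, the version is a placeholder), a single property drives several artifacts:

```xml
<properties>
    <!-- One place to bump the whole family. -->
    <jackson.version>2.6.7</jackson.version>
</properties>

<dependencies>
    <dependency>
        <groupId>com.fasterxml.jackson.core</groupId>
        <artifactId>jackson-core</artifactId>
        <version>${jackson.version}</version>
    </dependency>
    <dependency>
        <groupId>com.fasterxml.jackson.core</groupId>
        <artifactId>jackson-databind</artifactId>
        <version>${jackson.version}</version>
    </dependency>
</dependencies>
```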
When provided, import a BOM as an alternative to the point above to better handle families of dependencies (e.g., jboss), delegating the management of a certain set of dependencies to another pom.xml file.
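For instance, Jackson publishes a BOM; importing it in dependencyManagement (the version below is a placeholder) lets you omit versions on the individual Jackson artifacts:

```xml
<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>com.fasterxml.jackson</groupId>
            <artifactId>jackson-bom</artifactId>
            <version>2.6.7</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
    </dependencies>
</dependencyManagement>
```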
Don't (ab)use SNAPSHOT dependencies (or do so as little as possible). If you really need them, make sure you never release using a SNAPSHOT dependency: build reproducibility would otherwise be in serious danger.
When troubleshooting, always check the full hierarchy of your pom.xml file; using help:effective-pom can be really useful for checking the effective dependencyManagement, dependencies, and properties as far as the final dependency graph is concerned.
Use other Maven plugins to help you with governance. The maven-dependency-plugin is really helpful during troubleshooting, but the maven-enforcer-plugin also comes to the rescue. Here are a few examples worth mentioning:
The following example will make sure that no one (you, your teammates, your future self) can add a well-known test library in compile scope: the build will fail. It makes sure junit will never reach production (e.g., packaged with your war).
<plugin>
    <artifactId>maven-enforcer-plugin</artifactId>
    <version>1.4.1</version>
    <executions>
        <execution>
            <id>enforce-test-scope</id>
            <phase>validate</phase>
            <goals>
                <goal>enforce</goal>
            </goals>
            <configuration>
                <rules>
                    <bannedDependencies>
                        <excludes>
                            <exclude>junit:junit:*:*:compile</exclude>
                            <exclude>org.mockito:mockito-*:*:*:compile</exclude>
                            <exclude>org.easymock:easymock*:*:*:compile</exclude>
                            <exclude>org.powermock:powermock-*:*:*:compile</exclude>
                            <exclude>org.seleniumhq.selenium:selenium-*:*:*:compile</exclude>
                            <exclude>org.springframework:spring-test:*:*:compile</exclude>
                            <exclude>org.hamcrest:hamcrest-all:*:*:compile</exclude>
                        </excludes>
                        <message>Test dependencies should be in test scope!</message>
                    </bannedDependencies>
                </rules>
                <fail>true</fail>
            </configuration>
        </execution>
    </executions>
</plugin>
Have a look at the other standard rules this plugin offers: many can be useful for breaking the build in wrong scenarios:
You can ban a dependency (even transitively), which is really handy in many cases.
You can fail in case a SNAPSHOT is used, handy in a release profile, for example.
Again, a common parent POM could combine more than one of these mechanisms (dependencyManagement, the enforcer plugin, properties for dependency families) and make sure certain rules are respected. You may not cover every possible scenario, but it will definitely decrease the degree of hell you perceive and experience.
Use the Maven Helper plugin to easily resolve conflicts by excluding old versions of dependencies.
In my experience I haven't found anything fully automated, but I have found the following approach quite systematic and useful:
First of all, I try to have a clear map of the project structure and the relations between projects. I usually use Eclipse's graphical dependency view, which tells me, for example, if a dependency is omitted for a conflict with another one.
Moreover, it tells you the resolved dependencies for the project.
I honestly don't use IntelliJ IDEA, but I believe it has a similar feature.
Usually I try to put very common dependencies higher in the structure, and I exploit the <dependencyManagement> feature to take care of the versions of transitive dependencies and, most important, to avoid duplicates in the project structure.
In this Maven - Manage Dependencies blog post you can find a good tutorial about dependency management.
When adding a new dependency to my project, as in your case, I take care of where it is added in the project structure and make changes accordingly, but in most cases the dependency management mechanism is capable of dealing with this problem.
In this Maven Best Practices blog post you can find:
Maven's dependencyManagement section allows a parent pom.xml to define
dependencies that are potentially reused in child projects. This
avoids duplication; without the dependencyManagement section, each
child project has to define its own dependency and duplicate the
version, scope, and type of the dependency.
Obviously, if you need a particular version of a dependency for a project, you can always specify the version you need locally, deeper in the hierarchy.
I agree with you: it can be quite tedious, but dependency management can give you a lot of help.
Even after replacing all the jars with the same name, you can still have classes with the same fully qualified name. I used the Maven Shade Plugin in one of my projects; it prints classes with the same fully qualified name coming from different jars. Maybe that can help you.
Related
I come from a .NET background and need to do some Java work these days. One thing I don't quite understand is how the Java runtime resolves its jar dependencies. For example, I want to use javax.jcr to do some node adding, so I know I need to add these two dependencies, because I need to use javax.jcr.Node and org.apache.jackrabbit.commons.JcrUtils.
<dependency>
    <groupId>javax.jcr</groupId>
    <artifactId>jcr</artifactId>
    <version>2.0</version>
</dependency>
<dependency>
    <groupId>org.apache.jackrabbit</groupId>
    <artifactId>jackrabbit-jcr-commons</artifactId>
    <version>2.8.0</version>
</dependency>
Now it passes compilation, but I get an exception at runtime. Then someone told me to add one more dependency, which solved the problem.
<dependency>
    <groupId>org.apache.jackrabbit</groupId>
    <artifactId>jackrabbit-jcr2dav</artifactId>
    <version>2.6.0</version>
</dependency>
From my understanding, jackrabbit-jcr-commons needs jackrabbit-jcr2dav to run. If a jar is missing a dependency, how can it pass compilation? And how do I know I'm missing this particular dependency of jcr-commons? This is a general question; it doesn't have to be specific to Java JCR.
Java doesn't have any built-in way to declare dependencies between libraries. At runtime, when a class is needed, the Java ClassLoader tries to load it from all the jars on the classpath, and if the class is missing you get an exception. All the jars you need must be explicitly listed on the classpath. You can't just add one jar and hope for Java to transitively load classes from that jar's dependencies, because jar dependencies are a Maven concept, not a Java concept. Nothing, by the way, forbids a library writer from compiling 1000 interdependent classes at once but putting the compiled classes into three different jars.
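A tiny sketch of that behavior: the runtime resolves classes by name against the classpath, and there is no inter-jar dependency record to consult. The Jackrabbit class name below is just an example from the question; whether it loads depends entirely on whether its jar happens to be on the classpath.

```java
public class ClasspathDemo {
    // Returns true only if the named class can be loaded from the current classpath.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // A JDK class is always present.
        System.out.println(isOnClasspath("java.util.List"));
        // A library class is present only if its jar was put on the classpath.
        System.out.println(isOnClasspath("org.apache.jackrabbit.commons.JcrUtils"));
    }
}
```

Compilation succeeds or fails on exactly the same basis: whatever jars you hand the compiler, not any declared dependency graph.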
So what's left is Maven. I know nothing about JCR. But if a jar A published on Maven depends on a jar B published on Maven, then it should list B in its list of dependencies, and Maven should download B when it downloads A (and put both jars in the classpath).
The problem, however, is that some libraries have a loose dependency on other libraries. For example, Spring has native support for Hibernate. If you choose to use Spring with Hibernate, then you will need to explicitly declare Hibernate in your dependencies. But you could also choose to use Spring without Hibernate, and in that case you don't need to put Hibernate in the dependencies. Spring thus chooses to not declare Hibernate as one of its own dependencies, because Hibernate is not always necessary when using Spring.
In the end, it boils down to reading the documentation of the libraries you're using, to know which dependencies you need to add based on the features you use from these libraries.
Maven calculates transitive dependencies at compile time, so compilation passes OK. The issue is that, by default, Maven won't build a proper java -cp command line to launch your application with all of its dependencies (direct and transitive).
Two options to solve it:
Adjust your Maven project to build a "fat jar", a jar which includes all needed classes from all dependencies. See this SO answer with a pom.xml snippet to do it: https://stackoverflow.com/a/16222971/162634. Then you can launch with just java -cp myfatjar.jar my.app.MainClass
For a multi-module project with several result artifacts (that is, usually, different Java programs), it makes sense to build a custom assembly.xml which tells Maven how to package your artifacts and which dependencies to include. You'll need to provide some kind of script in the resulting package containing the proper java -cp ... command. As far as I know, there's no "official" Maven plugin to build such a script during compilation/packaging.
There's a free Maven book which more or less explains how dependencies and assemblies work.
Your question mixes Maven (a Java-centric dependency resolution tool) with Java compile-time and run-time class resolution. The two are quite different.
A Java .jar is, in simplified terms, a .zip file of Java .class files. During compilation, each Java source file, say MyClass.java, results in a Java bytecode file with the same name (MyClass.class). For compilation to be successful, all classes mentioned in a Java file must be available on the classpath at compile time (but note that use of reflection and run-time class-name resolution, à la Class.forName("MyOtherClass"), can avoid this entirely; also, you can use several class loaders, which may be scoped independently of each other...).
However, after compilation, you do not need to place all your .class files together in the same jar. Developers can split their .class files between jars however they see fit. As long as a program that uses those jars only refers at compile time to, and loads at run time, classes that have all their dependencies available at compile time and run time, you will not see any runtime errors. Classes in a .jar file are not recompiled when you compile a program that uses them; but if any of their dependencies fails at run time, you will get a run-time exception.
When using Maven, each Maven artifact (typically a jar file) declares, in its pom.xml file, the artifacts it depends on. If it makes any sense to use my-company:my-library-core without needing my-company:my-library-random-extension, it is best practice not to make -core depend on -random-extension, although typically -random-extension will depend on -core. Any dependencies of an artifact that you depend on are resolved and "brought in" when Maven runs.
Also, from your question, a word of warning: it is highly probable that jackrabbit-jcr2dav version 2.6.0 expects to run alongside jackrabbit-jcr-commons version 2.6.0, and not 2.8.0.
If I had to guess (without spending too much time delving into the Maven hierarchies of this particular project), I believe your problem is caused by the fact that jackrabbit-jcr-commons has an optional dependency on jackrabbit-api. That means you will not automatically get that dependency (and its dependencies) unless you re-declare it in your POM.
Generally speaking, optional dependencies are a band-aid solution to structural problems within a project. To quote the maven documentation on the subject (http://maven.apache.org/guides/introduction/introduction-to-optional-and-excludes-dependencies.html):
Optional dependencies are used when it's not really possible (for
whatever reason) to split a project up into sub-modules. The idea is
that some of the dependencies are only used for certain features in
the project, and will not be needed if that feature isn't used.
Ideally, such a feature would be split into a sub-module that depended
on the core functionality project...this new subproject would have
only non-optional dependencies, since you'd need them all if you
decided to use the subproject's functionality.
However, since the project cannot be split up (again, for whatever
reason), these dependencies are declared optional. If a user wants to
use functionality related to an optional dependency, they will have to
redeclare that optional dependency in their own project. This is not
the most clear way to handle this situation, but then again both
optional dependencies and dependency exclusions are stop-gap
solutions.
Generally speaking, exploring the POMs of your dependencies will reveal this kind of problem, though that process can be quite painful.
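If that guess is right, the workaround is to redeclare the optional dependency in your own POM. The version below simply mirrors the 2.8.0 jackrabbit-jcr-commons from the question and should match whatever you actually use:

```xml
<!-- Optional dependencies are not inherited transitively,
     so they must be redeclared by the consumer. -->
<dependency>
    <groupId>org.apache.jackrabbit</groupId>
    <artifactId>jackrabbit-api</artifactId>
    <version>2.8.0</version>
</dependency>
```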
I'm starting to fix a Java project that uses Maven, and while I've gotten the project to build, at runtime it fails with missing dependencies. I've had a look, and the errors are missing optional dependencies of included compile-time dependencies. I can go through and add these, but it seems to me that I can have everything building and running nicely, only for some piece of code I missed to use a missing dependency and the whole thing to fall apart.
What I really want to know is whether there is an automated way to find optional dependencies that I have chosen not to include. I have used mvn dependency:tree, but this only shows the dependencies I have (not sure of the scope it checks), and I have tried mvn dependency:analyze, but this seems to show dependencies it thinks I don't use and those that have been pulled in indirectly. What I cannot see is how to get a list of the optionals I don't include.
Currently my method of working around this is to read the POMs and try to work it out from there, but I don't see this as particularly robust.
For reference, I am fairly new to Maven-style dependency management and on the face of it I like it, but this optional thing is a bit of a stumbling block for me. I understand that optionals stop me pulling down dependencies I won't be using, but it hasn't clicked for me how I can work out which optionals are available and which ones I do need.
I am using Eclipse Juno, m2Eclipse (also have maven 3.0.5 cli), java 6/7.
Anyone got any ideas of how I can do this better, or what I am completely overlooking?
No, things are just this way, more or less. Maven does not do dependency management for you; it lets you do dependency management by offering tools to use and analyze dependencies. So the work is still on the developer's side. People often mix that up.
This is mainly because projects often have different deployment targets. As a result, they sometimes collect one bunch of jar files to be copied into Tomcat and a different set of files for WebLogic. So there might be a readme in your project that states what to copy prior to deployment of the Maven artifacts. Or it is implicit knowledge, in which case you're doomed.
dependency:analyze works on bytecode, not on sources, so it does not see what Maven knows.
Maybe mvn help:effective-pom gives a better basis to analyze the whole thing? Or you could try to modify the dependency plugin to show that information as well. Maven plugins are not so hard to work with.
I'm not aware of a plugin that displays all optional transitive dependencies. But since the pom.xml files of dependencies are downloaded into the local Maven repository, you could do a text search there.
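A sketch of such a text search, assuming the default local repository location (adjust the path if you configure a custom localRepository in settings.xml):

```shell
# List every downloaded POM that declares an optional dependency.
repo="$HOME/.m2/repository"
mkdir -p "$repo"  # avoid a grep error on a machine with no local repo yet
grep -rl --include='*.pom' '<optional>true</optional>' "$repo" || echo "none found"
```

From the matching POMs you can then read off which optional artifacts exist and decide which ones your runtime actually needs.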
A while ago there was a discussion on optional dependencies as well: Best strategy for dealing with optional dependencies - it might be helpful too.
I'm not sure whether the title makes a whole lot of sense and whether this post already answers my question, but here it is:
We have a multi-module project. As you would expect, this project has a combination of internal and third-party dependencies. For the third-party dependencies we define versions in the dependencyManagement section of our parent POM, so that we can manage them in a single common place.
As for inter-project (internal) dependencies, so far we've just entered the versions within each module's POM where a dependency is required. Then, when doing a prepare with the release plugin, these versions are updated appropriately - all very nice.
What we want, as with the third-party dependencies, is to be able to specify the internal dependency versions in the parent POM and therefore have a single common place. I see three potential approaches.
We do this by creating a property in the parent POM.
We do it via the dependency management section in the parent POM.
We use the project version property as the dependency version.
The preference would be to use one of the first two approaches, though there isn't really a strong reason for that. This leads me to the main concern and question: if we use either of the first two approaches, will the release plugin still update the dependency versions during the prepare stage?
All thoughts/feedback appreciated.
In the end we used the project.version property to help manage this. However, from what I understand, I believe using the dependency management section in the parent POM would also work.
I wonder if this is a trivial question that I'm simply not aware of.
In a multi-module Maven project, let's say there is a 'common' module. For example, there are 5 modules, out of which 1 is common. Is there a way to determine whether the other 4 modules depend on the common module class-wise, i.e., for each and every class in the common module, I want to know the classes in the other modules that depend on it? (Actually, Maven does not matter here.)
Does Eclipse itself have this feature?
It would be great if the tool gave a diagrammatic representation.
As far as I know, Maven doesn't work 'class-wise'; the module is its atomic element in terms of dependencies.
You can use mvn dependency:tree in order to get the dependencies per module; in Eclipse/IntelliJ enterprise edition you have a graphical representation of the results, but that's it.
Basically, you must distinguish between compile-time and run-time dependencies.
Of course, if you have an 'unsatisfied' compile-time dependency in some class, for example using Logger without having log4j or another relevant library on the classpath, you'll get an error during compilation of that class. That's the compiler's job, not Maven's.
Runtime dependencies are even harder to track. For example:
If you're running inside a container and you define your log4j library dependency in 'provided' scope, then you're relying on the container to bring that library to you and take care of all the class-loading stuff.
But how can Maven know what's going on inside a container?
So, bottom line: what you're asking for is impossible in Maven, and I've tried to explain why :)
Hope this helps
In Eclipse you can simply do that by selecting the class you want to examine and pressing CTRL-SHIFT-G, which will search for references within the workspace. In this case it means you have to have all the modules of the multi-module project open. The drawback is that you need to do this for every class you would like to know about.
I'm one of the developers, so I'm not unbiased, but I believe that Restructure101 is perfect for what you want. Point RS101 at the root POM and you'll see a dependency map of all the POMs, something like this:
Then you can chase dependencies from one POM to another by double-clicking to expand any item to whatever level you want. In this case I have drilled into maven-core to discover what is used by code in maven-compat:
You can also use Restructure101 to reorganize classes between POMs (like the creation/improvement of a common POM, as you mention), for example by dragging classes to new POMs and seeing the effect on the POM-level dependencies. An action list is exported to your IDE.
The companion product Structure101 has related capabilities worth checking out, but I'd prefer Restructure101 for what you describe.
What are the possibilities to enforce restrictions on the package dependencies in a Java build system? For example, the myapp.server.bl.Customer class should not be allowed to refer to the myapp.client.ui.customlayout package.
I'm interested in either Ant-based or IDE-specific solutions.
I'd like to get an error message in the build indicating that a (custom) package dependency rule has been violated and the build aborted. I also would like to maintain the dependencies in a list, preferably in a text file, outside of the Ant scripts or IDE project files.
(I don't know Maven, but I've read here that it has better support for module dependency management.)
I believe Checkstyle has a check for that.
It's called Import Control
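A sketch of an ImportControl configuration for the example in the question (the file is referenced from the check's "file" property; package names are taken from the question, so treat the details as a starting point rather than a verified config):

```xml
<!DOCTYPE import-control PUBLIC
    "-//Checkstyle//DTD ImportControl Configuration 1.4//EN"
    "https://checkstyle.org/dtds/import_control_1_4.dtd">
<import-control pkg="myapp">
    <allow pkg="java"/>
    <subpackage name="server.bl">
        <!-- Customer (and everything else in server.bl) may not
             import the client UI package. -->
        <disallow pkg="myapp.client.ui.customlayout"/>
    </subpackage>
</import-control>
```

Violations then surface as ordinary Checkstyle errors, which can be wired to fail the Ant build.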
You can configure Eclipse projects to specify Access Rules. Access rules can specify "Forbidden", "Discouraged", and "Accessible" levels all with wildcard rules. You can then configure violations of either Discouraged or Forbidden to be flagged as either warnings or errors during builds.
Kind of an old article on the idea (details may be out of date):
http://www.eclipsezone.com/eclipse/forums/t53736.html
If you're using Eclipse (or OSGi) plugins, then the "public" parts of the plugin/module are explicitly defined and this is part of the model.
Ivy seems like a good solution for your problem (if you are using Ant). Ivy is the official dependency management component of Ant and thus integrates nicely with it. It is capable of resolving dependencies, handling conflicts, creating exclusions, and so on.
It uses a simple XML structure to describe dependencies and is easier to use than Maven, because it only tries to address dependency resolution problems.
From the Ivy homepage:
Ivy is a tool for managing (recording, tracking, resolving and reporting) project dependencies. It is characterized by the following:
flexibility and configurability - Ivy is essentially process agnostic and is not tied to any methodology or structure. Instead it provides the necessary flexibility and configurability to be adapted to a broad range of dependency management and build processes.
tight integration with Apache Ant - while available as a standalone tool, Ivy works particularly well with Apache Ant providing a number of powerful Ant tasks ranging from dependency resolution to dependency reporting and publication.
As for IDE-specific solutions, IntelliJ IDEA has a dependency analysis tool that allows one to define invalid dependencies as well.
http://www.jetbrains.com/idea/webhelp2/dependency-validation-dialog.html
The dependency violation will be shown both when compiling and live, while editing the dependent class (as error/warning stripes in the right-hand error bar).
Even more automation can be obtained with JetBrains' TeamCity build server, which can run inspection builds and report the above configured checks.
For another IDE-independent solution, AspectJ can be used to declare invalid dependencies (and the step can be integrated into the build process in order to obtain warning/error info for the issues).
Eclipse has support for this via Build Path properties / jar properties. I think it may only work across jar / project boundaries.
Maybe Classycle can be used:
http://classycle.sourceforge.net/ddf.html
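Classycle's dependency definition files can express exactly the rule from the question; a sketch (syntax written from memory, so verify it against the linked page):

```
[ui] = myapp.client.ui.customlayout.*
[bl] = myapp.server.bl.*
check [bl] independentOf [ui]
```

The check can be run from an Ant task and made to fail the build, and the rules live in a plain text file outside the build script, as requested.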
You can use multiple modules in IDEA or Maven, or multiple projects in Eclipse and Gradle. The concept is the same in all cases.
A trivial interpretation would be a module for myapp.server.bl and another for myapp.client.ui.customlayout, with no compile-time dependencies between them. Any attempt to compile code or code-complete against the opposite module/project will then fail, as desired.
To audit how extensive the problem already is, a useful starting point for IntelliJ IDEA is Analyzing Dependencies:
http://www.jetbrains.com/idea/webhelp/analyzing-dependencies.html
From that article you can see how to run and act on dependency analysis for your project.