Analysis of unused transitive dependencies on the class level - java

Assume the following situation: my Maven project depends on a jar A, which depends on 10 other jars, which in turn transitively depend on many more jars. I get a huge classpath, and if I am building a war/ear, I get a huge artifact.
Actually, I am using only the class foo in jar A. The class foo uses a few other classes, which are contained in three other jars. So I really only need jar A and three other jars to compile, not the whole bunch of dependencies (and their dependencies and so on).
Is there a way to (semi-)automatically analyse dependency trees on the class level? As far as I know, Maven has no built-in functionality for this.
Just to make this clear: I know that such situations should not occur in a good software architecture. But if I get a jar A which is really just a collection of classes for different purposes, I potentially get a lot of unnecessary dependencies when I build the dependency tree with Maven. And changing A is not something I can do.

Some (long) time ago I started a Maven plugin for this:
https://github.com/highsource/storyteller-maven-plugin
How to find unneccesary dependencies in a maven multi-project?
It works, but it is in no way finished or documented. I also don't want to "sell" it here in any way.
But what you write were exactly my thoughts back then. storyteller-maven-plugin basically analyzed the dependencies of classes and built a huge graph of them. Then it could tell whether you actually need the dependencies you've declared in your project. It could also export nice graphs of dependencies (using GraphViz).
I never had time to finish it, but maybe someone would be interested? The heavy lifting is done already.
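To make the idea concrete, here is a minimal sketch of the kind of class-level analysis described above. It is not how the plugin works internally. It assumes you already have a map of which class references which, and a map of which jar contains which class. All class names, jar names, and the graph itself are invented for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: every class name, jar name, and edge below is made up.
class NeededJars {

    /** Returns every class reachable from the entry classes by following references. */
    static Set<String> reachableClasses(Map<String, List<String>> references,
                                        List<String> entryClasses) {
        Set<String> seen = new LinkedHashSet<>(entryClasses);
        Deque<String> todo = new ArrayDeque<>(entryClasses);
        while (!todo.isEmpty()) {
            String current = todo.pop();
            for (String next : references.getOrDefault(current, List.of())) {
                if (seen.add(next)) {   // only queue classes not visited before
                    todo.push(next);
                }
            }
        }
        return seen;
    }

    /** Maps the reachable classes to their jars; classes with no known jar are skipped. */
    static Set<String> neededJars(Map<String, List<String>> references,
                                  Map<String, String> jarOfClass,
                                  List<String> entryClasses) {
        Set<String> jars = new TreeSet<>();
        for (String cls : reachableClasses(references, entryClasses)) {
            String jar = jarOfClass.get(cls);
            if (jar != null) {
                jars.add(jar);
            }
        }
        return jars;
    }

    public static void main(String[] args) {
        Map<String, List<String>> references = Map.of(
                "a.Foo", List.of("b.Util", "c.Codec"),
                "b.Util", List.of("d.Strings"),
                "a.Unused", List.of("e.Heavy"));
        Map<String, String> jarOfClass = Map.of(
                "a.Foo", "A.jar", "a.Unused", "A.jar",
                "b.Util", "B.jar", "c.Codec", "C.jar",
                "d.Strings", "D.jar", "e.Heavy", "E.jar");
        // Only a.Foo is used, so E.jar never shows up.
        System.out.println(neededJars(references, jarOfClass, List.of("a.Foo")));
    }
}
```

Keep in mind that a static graph like this cannot see classes loaded by reflection or through service lookups, so the result is a lower bound rather than a guarantee.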

Related

Separating test and core project with Maven

I already found this post: Separating tests from src with Maven Project?.
I've just started working on a Java project (as I usually use .net), and one of the first things that strikes me as odd is that in the Maven project there is a /src and a /test directory where obviously the source code and the tests should go.
In .net I preferred to have the tests in a separate assembly/project, so for example I would have:
MyProject
MyProject.Tests
That way I don't have to bloat my deployed code with any tests and it makes it easier to test my code in true isolation, and in a lot of cases I didn't bother writing tests per project, I would just have solution wide unit/integration/acceptance tests, i.e. MySolution.UnitTests, MySolution.IntegrationTests.
However in Java it just seems to be bundled together, and I would rather separate it out, however I hear that Maven is a cruel mistress when you want to do things differently to the default structures.
So to rein this post back in, my main [question is]:
Is there a way to separate out the tests from the project [How?]
Although this question exactly describes what I try to achieve, the thread has not provided the solution for how to do it.
I'd like to know whether there is a way to have a separate project just for (unit) testing with JUnit. I want to have the actual source code in a "core" project and the corresponding tests in a separate "test" project instead of having one single project with src/main and src/test paths.
However, I don't know how to configure the (parent) pom.xml files to achieve that.
So far, I defined a parent pom that declares the two projects as modules. In addition, for each of the two projects, I have a separate pom file declaring the required dependencies etc. Of course, the pom file of the test project defines the core project as a dependency.
I guess I have to configure the pom file of the core project to tell the testing plugin to look in the other project for the tests. But how should such behaviour be configured?
If you follow the Maven conventions (having both the src/main and src/test folders) you will have an easier time. Your tests will not be deployed along with your compiled source, so I wouldn't worry about bloat. Maven compiles main and test classes into separate output directories and packages only the main classes into the jar (assuming you're using jars); a separate test jar is built only if you configure the maven-jar-plugin's test-jar goal. If you really want separate src/test modules then yes, the multi-module approach with a common parent is the way to go. The test module would have a dependency on the source module but not the other way around. Really this just amounts to reinventing what Maven is already doing for you though.
In the long run, I think you'd be happier using the conventional approach though as things will go a lot smoother.
Do not pay attention to the naysayers who will try to convince you that you have to do it in one of the established ways or else you will run into trouble. This is cargo cult engineering, and it reflects the cowardice of your average enterprise employee out there, who will rather die than try something different or think outside the box for a moment.
It is perfectly doable to have a huge multi-module maven project, with loads of tests, where not a single module contains both production and test subfolders, and instead every single module is either production, or test. That's the only way they do it in the DotNet world, and I never heard anyone complaining.
There exist situations where you absolutely have to split your modules this way, so maven has no option but to support this. Such situations arise when the dependencies are such that the tests of module A depend on module B, which in turn depends on the production code of module A. If both the tests and production code of module A are in the same actual module, this causes a circular dependency, so the project is unbuildable. Such an arrangement is not commonplace, but it does happen sometimes. When it happens, you have to move the tests of A into a separate module C, which depends on both A and B, and leave only production code in A.
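The circular dependency argument can be checked with a small sketch. This is a toy model, not anything Maven runs: modules are plain strings, and the two maps below are my own encoding of the "tests inside A" and "tests moved to C" arrangements described above.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: a toy module graph, not Maven's actual reactor logic.
class ModuleCycleCheck {

    /** Returns true if some module can reach itself by following its dependencies. */
    static boolean hasCycle(Map<String, List<String>> dependsOn) {
        Set<String> visited = new HashSet<>();
        Set<String> onPath = new HashSet<>();
        for (String module : dependsOn.keySet()) {
            if (visit(module, dependsOn, visited, onPath)) {
                return true;
            }
        }
        return false;
    }

    private static boolean visit(String module, Map<String, List<String>> dependsOn,
                                 Set<String> visited, Set<String> onPath) {
        if (onPath.contains(module)) {
            return true;                 // we came back to a module on the current path
        }
        if (!visited.add(module)) {
            return false;                // already explored without finding a cycle
        }
        onPath.add(module);
        for (String dep : dependsOn.getOrDefault(module, List.of())) {
            if (visit(dep, dependsOn, visited, onPath)) {
                return true;
            }
        }
        onPath.remove(module);
        return false;
    }

    public static void main(String[] args) {
        // Tests live inside A: A (through its tests) needs B, and B needs A.
        Map<String, List<String>> testsInsideA = Map.of(
                "A", List.of("B"),
                "B", List.of("A"));
        // Tests moved to C: A has no dependencies, B needs A, C needs both.
        Map<String, List<String>> testsInC = Map.of(
                "A", List.<String>of(),
                "B", List.of("A"),
                "C", List.of("A", "B"));
        System.out.println(hasCycle(testsInsideA)); // true
        System.out.println(hasCycle(testsInC));     // false
    }
}
```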
In maven, there is nothing special to it: in production modules you only specify <sourceDirectory>, while in test modules you only specify <testSourceDirectory>. Everything else is done as expected: Both modules have the same parent pom, and the parent pom references them both. JUnit and other test-related dependencies are only included by the test modules. It is so straightforward that it is trivial. (I am not sure what kind of trouble the OP was facing that made him ask the question.)
As a matter of fact, if it was not for the particular maven plugins that people use for running tests during continuous deployment, you would not even need <testSourceDirectory>, you could be using in all modules nothing but <sourceDirectory>. IntelliJ IDEA does not have a problem detecting and running tests even if they are under <sourceDirectory>, but the maven surefire plugin does expect tests to be under <testSourceDirectory>, so you have to use <testSourceDirectory> just to keep that plugin happy.
My personal opinion is that supporting a distinction between production and test subfolders within the same module adds a mind-boggling amount of completely unnecessary complication to build systems. The entire java world would be doing just fine if the feature did not exist at all. Of course this opinion is tentative, since unbeknownst to me there may exist important reasons due to which this distinction is useful. If anyone knows of any such reasons, please enlighten me in the comments.

Java: Tool to determine the dependents of every java class in a maven module

I wonder whether this is a trivial question whose answer I am simply not aware of.
In a multi-module maven project, let's say that there is a 'common' module. For example, there are 5 modules, one of which is common. Is there a way to determine how the other 4 modules depend on the common module class-wise, i.e. for each and every class in the common module, I want to know the classes in the other modules which depend on that class in common? (Actually maven does not matter here, though.)
Does Eclipse itself have this feature?
It would be great if the tool gave a diagrammatic representation.
As far as I know, Maven doesn't work 'class-wise', module is its atomic element in terms of dependencies.
You can use mvn dependency:tree to get the dependencies per module; in Eclipse/IntelliJ enterprise edition you have a graphical representation of the results, but that's it.
Basically you must distinguish between compile-time and runtime dependencies.
Of course, if you have an 'unsatisfied' compile-time dependency in some class (for example, using Logger without having log4j or another relevant library on the classpath), you'll get an error during the compilation of your class. That is the compiler's job, not Maven's.
Now runtime dependencies are even harder to track. For example:
if you're running inside some container and you define your log4j library dependency with 'provided' scope, then you're relying on the container to bring that library to you and to take care of all the class loading stuff.
But how can Maven know what's going on inside a container?
So, bottom line: what you're asking for is impossible in Maven, and I've tried to explain why :)
Hope this helps
In Eclipse you can simply do that by selecting the class you want to examine and pressing CTRL-SHIFT-G, which searches for references within the workspace. This means you have to have all the modules of the multi-module project open. The drawback is that you need to do this for every class you are interested in.
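If you can get a class-level "who uses what" list out of some tool (the JDK's jdeps can print class-level dependencies, for example), turning it around so that every common class lists its users is straightforward. The following is only a sketch: the input map, the package prefix "common." and all class names are made up.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical sketch: the usage data and the "common." prefix are invented.
class DependentsIndex {

    /** Inverts "class -> classes it uses" into "common class -> outside classes using it". */
    static Map<String, Set<String>> dependentsOf(Map<String, List<String>> uses,
                                                 String commonPackagePrefix) {
        Map<String, Set<String>> dependents = new TreeMap<>();
        for (Map.Entry<String, List<String>> entry : uses.entrySet()) {
            String user = entry.getKey();
            if (user.startsWith(commonPackagePrefix)) {
                continue;   // usage inside the common module itself is not interesting here
            }
            for (String used : entry.getValue()) {
                if (used.startsWith(commonPackagePrefix)) {
                    dependents.computeIfAbsent(used, k -> new TreeSet<>()).add(user);
                }
            }
        }
        return dependents;
    }

    public static void main(String[] args) {
        Map<String, List<String>> uses = Map.of(
                "web.LoginServlet", List.of("common.Strings", "common.Dates"),
                "batch.Importer", List.of("common.Strings"),
                "common.Dates", List.of("common.Strings"));
        dependentsOf(uses, "common.")
                .forEach((cls, users) -> System.out.println(cls + " <- " + users));
    }
}
```

Classes in the common module that never appear as a key in the result have no users in the other modules, which is often the more interesting finding.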
I'm one of the developers, so I'm not unbiased, but I believe that Restructure101 is perfect for what you want. Point RS101 at the root POM and you'll see a dependency map of all the POMs, something like this:
Then you can chase dependencies from one pom to another by double-clicking to expand any item to whatever level you want. In this case I have drilled into Maven-core to discover what is used by code in maven-compat:
You can also use Restructure101 to reorganize classes between poms (like creation/improvement of a common pom as you mention), for example by dragging classes to new poms and seeing the effect on the pom-level dependencies. An action list is exported to your IDE.
The companion product Structure101 has related capabilities, worth checking, but I'd prefer Restructure101 for what you describe.

Finding out all conflicting packages/classes of referenced jars in an Eclipse project

I am currently dealing with a huge Eclipse project (not written by me). This project doesn't use any dependency management tools. It references hundreds of JARs.
Some of these JARs contain the same packages (and classes), but in different versions. Currently, resolving conflicts works by manually (and randomly!) reordering these JARs in Order&Export (in Project Properties).
This has been done for a long time now, and there are now lots of packages/classes with different vendors/versions/product-lines.
Reordering causes some parts of the project to fail while other parts start working, and vice versa.
Strangely, lots of orders do not cause build errors, but only runtime errors.
Can this mess be solved by a tool that would suggest a suitable order of the dependent JARs automatically?
Google for JarAnalyzer; it helps at least to figure out how the dependencies are built up. Run it on the jars your Eclipse project produces as well. However, you cannot really automate this. Imagine one of your Eclipse projects needs bad-1.0.jar and another one uses bad-1.2.jar. Very often you cannot replace the 1.0 one with the 1.2 one because your project won't compile any more. So in the long run you have to REMOVE outdated jars, switch to a "common version" among all subprojects and fix the compiler errors. And while you do that, switch to Ivy or Maven.
Do your jar files even have proper names, or do you have 3 different versions of bad.jar which look the same in the filesystem but are in fact different versions? If so, start by renaming all relevant jar files to include the version number (it can often be found in the manifest file) ... heck, I once did what you are doing and wrote a small tool, using JarAnalyzer, a bit of Groovy and some shell scripts, that generated all the Ivy files for the project.
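As a first inventory step, you can at least list which classes occur in more than one jar. The sketch below uses only the JDK's java.util.jar API. It is not JarAnalyzer and it does not suggest an order; it just reports the overlaps. The class name DuplicateClassFinder and the way jars are passed on the command line are my own choices.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.jar.JarFile;
import java.util.stream.Collectors;

// Hypothetical sketch: reports classes that appear in more than one jar.
class DuplicateClassFinder {

    /** Lists the names of all .class entries in one jar file. */
    static List<String> classEntries(Path jar) throws IOException {
        try (JarFile jarFile = new JarFile(jar.toFile())) {
            return jarFile.stream()
                    .map(entry -> entry.getName())
                    .filter(name -> name.endsWith(".class"))
                    .collect(Collectors.toList());
        }
    }

    /** Turns "jar -> class entries" into "class entry -> jars", keeping only duplicates. */
    static Map<String, List<String>> duplicates(Map<String, List<String>> entriesByJar) {
        Map<String, List<String>> jarsByClass = new TreeMap<>();
        entriesByJar.forEach((jar, entries) ->
                entries.forEach(entry ->
                        jarsByClass.computeIfAbsent(entry, k -> new ArrayList<>()).add(jar)));
        jarsByClass.values().removeIf(jars -> jars.size() < 2);
        return jarsByClass;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java DuplicateClassFinder.java lib/first.jar lib/second.jar ...
        Map<String, List<String>> entriesByJar = new TreeMap<>();
        for (String arg : args) {
            entriesByJar.put(arg, classEntries(Path.of(arg)));
        }
        duplicates(entriesByJar)
                .forEach((cls, jars) -> System.out.println(cls + " in " + jars));
    }
}
```

The output tells you which jars compete for the same class names, which is the information you need before deciding which outdated jars to remove.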
You can use Maven or Ivy to clean up the mess :). And if Spring doesn't work properly, try this: first clean, then build the project.
"Strangely, lots of orders do not cause build errors, but only runtime errors."
This is not strange. As you wrote, classes are present in different versions, which does not necessarily mean a compilation error, but it does mean different behaviour and different sub-dependencies.
Avoid a "random" or "automatic order" approach. I would advise using Maven to handle your dependencies (in order to know precisely which library depends on which one). You will probably discover that many of the libraries you're including are not required, and that the dependency management tool will "automatically" handle all dependencies between dependencies for you. You will, however, have to add/force exclusions for specific library versions.
Beyond that, it will help you simplify the code, and eventually you may remove one line of code and 40 dependencies with it (for example, where a side framework such as Spring, or any other one, was pulled in and misused).

How to examine required libraries?

I am developing a web application with a lot of libraries such as Spring, Apache CXF, Hibernate, Apache Axis, Apache Commons and so on. Each of these frameworks comes with a lot of *.jar libraries.
For development I simply take all of the delivered libraries and add them to my classpath.
For deployment not all of these libraries are required, so is there a quick way to examine all the required libraries (*.jar) which are used by my source code?
If you move your project to use Maven such things become easier:
mvn dependency:analyze
mvn dependency:tree
For your example, Maven + IDE + nice dependency diagrams could help a lot.
See an example of this: it's much easier this way to figure out what happens in a project, and this way you don't need to add "all delivered libraries" to your project, just what is required.
JDepend traverses Java class file directories and generates design quality metrics for each Java package. JDepend allows you to automatically measure the quality of a design in terms of its extensibility, reusability, and maintainability to manage package dependencies effectively.
So, as a quick, dirty, and potentially inefficient way, you can try this in Eclipse:
1. Create two copies of your project.
2. In project copy #2 remove all the jars from the classpath.
3. Pick a source file that now has errors because it can't resolve a class reference. Pick one of the unresolved classes and note its fully qualified class name.
4. Do Control-Shift-T and locate the unresolved class. You should be able to see which jar it's contained in, since all the jars are still in the classpath for project copy #1.
5. Add the jar that contains this unresolved class back into your classpath in project copy #2, then repeat steps 3 and 4 until all class references are resolved.
Unfortunately you're not done yet since the jar files themselves may also have dependencies. Two ways to deal with this:
1. Go read the documentation for all the third-party packages you're using. Each package should tell you what its dependencies are.
2. Run your application and see if you get any ClassNotFoundExceptions. If you do, then use Control-Shift-T to figure out what jar that class comes from and add it to your classpath. Repeat until your project runs without throwing any ClassNotFoundExceptions.
The problem with #2 is that you don't really know you've resolved all the dependencies since you can't simulate every possible execution path your project might take.
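A small helper can complement the Control-Shift-T lookups: for a class that does load, the JVM can tell you which jar or directory it came from. This is only a sketch, and WhichJar and locationOf are names I made up. It cannot find a jar for a class that is missing, but it is handy for confirming which of several candidate jars actually supplies a class in project copy #1.

```java
import java.net.URL;
import java.security.CodeSource;
import java.util.Optional;

// Hypothetical sketch: prints where already loadable classes come from.
class WhichJar {

    /** Returns the jar or directory a class was loaded from, or empty if the JVM reports none. */
    static Optional<URL> locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        return source == null ? Optional.empty() : Optional.ofNullable(source.getLocation());
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Pass fully qualified class names; each must be loadable on the current classpath.
        for (String name : args) {
            Class<?> type = Class.forName(name);
            String where = locationOf(type)
                    .map(URL::toString)
                    .orElse("(no code source, probably a JDK class)");
            System.out.println(name + " -> " + where);
        }
    }
}
```

Run it with the full classpath of project copy #1 and the class names you noted in step 3.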

Whats the best way to resolve dependencies between Java projects?

I think most of you will know, programmers often reuse code from other software. I think, most of the time it is a good idea. But if you use code from another project your program depends on the other project.
In my current case I have three Java projects A, B and C. A uses B and B uses C. I'm using the Eclipse IDE and added B to the buildpath of A and C to the buildpath of B. Now there is a compiler error saying that A can't resolve something from C. So I have to add C to the buildpath of A as well.
So what is the best way to resolve the dependencies while keeping your program as independent as possible from other projects?
I would like to know this in general and in reference to my current situation. Are there better ways to do this? For example, there are classpath settings in the launch / debug configuration view, but I think they won't help at compile time.
Thanks in advance.
This sounds like part of the problem set fixed by Maven. Using Maven and Eclipse, namely m2eclipse, you can have projects use other projects and all the dependency resolution is handled for you.
It sounds to me like you're doing what you have to without incorporating a dependency management tool like Ivy or Maven, which provide you the capability of "transitive dependency management". With either of these tools you can just specify that A depends on B and B depends on C and they will automatically know that A is going to need C as well.
The advantages of Maven (this is what I have experience in) also come into play when it's time to package your projects for deployment, since it can easily gather all of those dependencies (all the way down the hierarchy) and place them together into a distribution folder or a fat JAR that contains all of your dependencies. It takes some reading and set-up time to get into a tool like Maven, but it does make the task of managing your dependencies a whole lot easier, especially as they grow.
We use Maven and it's essential for our projects. It's a good time for you to learn - dependencies on more than 3 projects can be frightening. Maven deals with versions so that if, for whatever reason, you have to depend on Foo.1.2.3 then Maven will ensure you don't get the wrong version.
However it's not trivial. If you use Netbeans, it's built in better than in Eclipse and may help you learn. (Also, projects are fairly switchable between the two systems.)
Maven supports a lot of concepts in its POM (pom.xml) file, including licence info, contributors, arguments, etc., so you get a lot more than just dependency management. And it supports modularisation of projects.
Don't skip the learning curve - you need to know how it works. But you will also find previous SO questions that will help.
Others have mentioned several of the good tools, maven probably being the most common. Ivy is another one that is more targeted at just dependency management. I personally use gradle which has some of the best of all of those features underneath a familiar groovy wrapper... that is still evolving and spottily documented. ;)
One thing to be aware of is how these tools handle transitive dependencies. In your example, C is a transitive dependency of A because A depends on B which depends on C. Some of these build tools will handle this type of dependency differently and it can surprise you when you least expect it.
For example, if A actually refers to code from C, i.e. it has a compile-time dependency on C, then your A->B->C setup will work in something like Maven. On the other hand, gradle will also make you declare that A depends on C... since it does. Runtime dependencies are fully resolved either way.
The surprise comes when you've been transitively including something for months and some of your code has relied on aspects of C and you decide you no longer need a B dependency. Suddenly your code won't build until you figure out you need a A->C dependency specified. In this example, that's pretty trivial to discover but sometimes it isn't.
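The effect is easy to reproduce in a toy model. The sketch below is not how Maven or gradle resolve anything internally; it just follows declared dependencies transitively, using the project names A, B and C from the question, and shows that C disappears from A's resolved set as soon as the A->B declaration is dropped.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: a toy resolver, not any real build tool's algorithm.
class TransitiveClasspath {

    /** Collects everything reachable from root by following declared dependencies. */
    static Set<String> resolve(Map<String, List<String>> declared, String root) {
        Set<String> resolved = new TreeSet<>();
        Deque<String> todo = new ArrayDeque<>(declared.getOrDefault(root, List.of()));
        while (!todo.isEmpty()) {
            String dep = todo.pop();
            if (resolved.add(dep)) {   // follow each dependency only once
                todo.addAll(declared.getOrDefault(dep, List.of()));
            }
        }
        return resolved;
    }

    public static void main(String[] args) {
        Map<String, List<String>> before = Map.of(
                "A", List.of("B"),
                "B", List.of("C"));
        Map<String, List<String>> afterDroppingB = Map.of(
                "A", List.<String>of(),
                "B", List.of("C"));
        System.out.println(resolve(before, "A"));         // [B, C]
        System.out.println(resolve(afterDroppingB, "A")); // []
    }
}
```

Declaring A->C explicitly from the start keeps C in the resolved set in both cases.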
And if talk like that makes your head swim a little and you don't plan on your project getting much more complicated... then you can probably just stick with what you are doing for a while. As others mentioned, it's the right way to do it without a tool helping you.
Use maven to manage your dependencies and then use the dependency plugin to see the dependencies.
you can run
mvn dependency:analyze
or
mvn dependency:tree -Dverbose=true
this will help you a lot.
no doubt you should use a dependency management tool as people have noted... manually though, archive B and C in B_C.jar. Test that B's dependence on C is resolved within the Jar.
Then add B_C.jar in the classpath...
Dependency management is a huge topic. Maven, Ivy and other tools have been developed to ease the pain with some success. Both of those tools create a dependency hierarchy so you don't run into the situation you described. They also have Eclipse plugins so that Eclipse will recognize that hierarchy.
To truly use these frameworks, you will have to change your current build process. Maven probably requires more of a commitment than Ivy, but neither is trivial, and understanding how to set it up takes some time. That said, it is very helpful to have your dependencies defined and managed clearly.
Free maven books:
http://www.sonatype.com/documentation/books
