I have a project that has 3rd party dependencies, as well as dependencies on internal projects. I need to strip the version numbers from the dependent artifacts that are developed in-house.
For example: spring-2.5.6.jar should appear in the final output as spring-2.5.6.jar, but MyInternalProject-1.0.17.jar needs to be changed to MyInternalProject.jar.
I can identify the internal dependencies easily enough by their group ID (they are all something like com.mycompany.*). The maven-dependency-plugin has a stripVersion option, but it does not seem to be selective enough. Is there a way to do this, short of explicitly naming each dependency and what their final name should be?
Phrased another way:
I would like to have different outputFileNameMappings for the maven-assembly-plugin for artifacts based on group ID. Is there a way to do this?
I think you can, using the following recipe:
First, in your aggregator pom use the dependency:copy-dependencies goal to copy your jars to some intermediate location. You will need two executions: one with <stripVersion>true</stripVersion> for your internal dependencies, and one with <stripVersion>false</stripVersion> for 3rd party libraries. You can include or exclude artifacts based on their group ID; see http://maven.apache.org/plugins/maven-dependency-plugin/copy-dependencies-mojo.html for full details.
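A rough sketch of those two executions (the staging directory and phase are assumptions, not requirements):

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-dependency-plugin</artifactId>
  <executions>
    <!-- In-house artifacts: copy with the version stripped -->
    <execution>
      <id>copy-internal</id>
      <phase>prepare-package</phase>
      <goals><goal>copy-dependencies</goal></goals>
      <configuration>
        <includeGroupIds>com.mycompany</includeGroupIds>
        <stripVersion>true</stripVersion>
        <outputDirectory>${project.build.directory}/staging</outputDirectory>
      </configuration>
    </execution>
    <!-- Third-party artifacts: copy with versions intact -->
    <execution>
      <id>copy-third-party</id>
      <phase>prepare-package</phase>
      <goals><goal>copy-dependencies</goal></goals>
      <configuration>
        <excludeGroupIds>com.mycompany</excludeGroupIds>
        <stripVersion>false</stripVersion>
        <outputDirectory>${project.build.directory}/staging</outputDirectory>
      </configuration>
    </execution>
  </executions>
</plugin>

Note that includeGroupIds/excludeGroupIds accept partial group IDs, so com.mycompany also matches com.mycompany.* groups.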
Then it should be a simple task to build a .zip using the maven-assembly-plugin!
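For instance, a minimal assembly descriptor could just zip up the staging directory from the sketch above (names are again placeholders):

<assembly>
  <id>bundle</id>
  <formats>
    <format>zip</format>
  </formats>
  <includeBaseDirectory>false</includeBaseDirectory>
  <fileSets>
    <!-- Everything copy-dependencies staged, with file names already normalized -->
    <fileSet>
      <directory>${project.build.directory}/staging</directory>
      <outputDirectory>lib</outputDirectory>
    </fileSet>
  </fileSets>
</assembly>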
Based on the comments, I would re-evaluate your approach here. Generally checking jars into source control is not a good idea, especially unversioned jars. Imagine if I just had a project that referenced someArtifact.jar and I was trying to debug it - how would I know which version it used?
Tools like Artifactory and Nexus were built for storing versions of your jars, both internal and 3rd party (they can also proxy public repositories like Maven Central). In order to keep builds reproducible, I would check your binaries into a tool designed for that, and then you can reference them by version. During development, you can reference SNAPSHOT versions of your jars to get the latest, and then when you do a release, you can reference stable versions of your binaries.
Source control systems were meant for storing source, not binaries.
Is there a way to fail a build in Jenkins if a certain jar is used in a Java Maven Project?
For example, I know org.example:badartifact:1.0.1 has a security vulnerability. I told everyone about that, and they fixed their projects..., but maybe some third-party artifacts bring it along as a transitive dependency and nobody realizes it.
Or maybe someone down the line forgets this old bug...
So I would like to have a final check, preferably in Jenkins, so that we don't end up with projects that have that particular artifact included.
How do you handle situations like that? What tools do you use? (Whitelisting libs? Blacklisting libs? etc.)
Any suggestions are appreciated.
Possible Maven solution
You could have a company super POM (parent POM of all Maven projects within the company/department/team) and in that super POM configure the Maven Enforcer Plugin and its bannedDependencies rule to ban any library, version, or even scope. I have personally used this option even for trivial mistakes (i.e. junit not in test scope would make the build fail).
This solution is a centralized one and as such easier to maintain; however, it requires all the projects to have the same parent POM, and developers could at any time change the parent POM and thereby skip this governance. On the other hand, a centralized parent POM is really useful for dependency management, common profiles, reporting, and so on.
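As an illustration, a bannedDependencies configuration in the super POM might look like this (the banned coordinates are examples; the last exclude shows the junit-not-in-test-scope check):

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-enforcer-plugin</artifactId>
  <executions>
    <execution>
      <id>enforce-banned-dependencies</id>
      <goals><goal>enforce</goal></goals>
      <configuration>
        <rules>
          <bannedDependencies>
            <searchTransitive>true</searchTransitive>
            <excludes>
              <!-- ban a known-vulnerable artifact, wherever it comes from -->
              <exclude>org.example:badartifact:1.0.1</exclude>
              <!-- ban junit outside of test scope -->
              <exclude>junit:junit:*:jar:compile</exclude>
            </excludes>
          </bannedDependencies>
        </rules>
        <fail>true</fail>
      </configuration>
    </execution>
  </executions>
</plugin>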
Note: you cannot configure it in the Maven settings of the Jenkins server via an active-by-default profile, for instance, in order to have it applied to all running Maven builds, because Maven limits customization of builds in profiles provided by the settings (it's a design choice, to limit external impact and as such have easier troubleshooting). I've tried it in the past and hit a wall.
Profiles in external files
Profiles specified in external files (i.e. in settings.xml or profiles.xml) are not portable in the strictest sense. Anything that seems to stand a high chance of changing the result of the build is restricted to the inline profiles in the POM. Things like repository lists could simply be a proprietary repository of approved artifacts, and won't change the outcome of the build. Therefore, you will only be able to modify the <repositories> and <pluginRepositories> sections, plus an extra <properties> section.
Possible Jenkins solution
If you want to have governance centralized in Jenkins directly, hence independently of the Maven builds, I have applied these solutions in the past (and they work perfectly):
Jenkins Text Finder Plugin: you can make the build fail when a regex or matching text is found in the build output. In your case, you could have a Jenkins build step that always executes mvn dependency:tree, so that the list of dependencies (even transitive ones) appears in the build output. A Text Finder rule matching your banned dependency will then catch it and fail the build.
Fail The Build Jenkins Plugin: similar to the one above, but with centralized management of configured Failure Causes. Again, failures are based on matching text, but no per-build configuration is required: it will be applied by default to all builds.
Here is one solution to do the job :)
With the Maven License plugin, you can scan the 3rd party dependencies for your Maven project and produce a THIRD_PARTY.txt report (in the target/generated-sources/license folder).
Maven command line:
mvn license:aggregate-add-third-party
Next, you can use the Text Finder plugin to search for the "unsafe" dependencies in the THIRD_PARTY.txt file (ex: org.example:badartifact:1.0.1) and change the status of the build if needed.
Another solution is to use a 3rd party tool to do that.
I'm doing some investigation with this one: http://www.whitesourcesoftware.com/
This tool can provide a list of 3rd party dependencies with vulnerability issues.
EDIT: This is about doing Continuous Delivery with Maven and having it orchestrated with Jenkins. Maven is definitely not designed for that, and this question is part of our effort to get an efficient workflow without using Maven releases. Help is appreciated.
We use Maven -SNAPSHOTs within major versions to ensure customers always get the latest code for that given version, which works well. For technical reasons we have two independent Maven jobs - one for compiling sources to jars, and one for combining the appropriate jars to a given deployment. This also works well.
We then have Jenkins orchestrating when to invoke the various steps, and this is where it gets a bit tricky: if we do the normal mvn clean install in step one, all the snapshot artifacts get recompiled, which in turn makes Jenkins think that all the snapshots changed (as their fingerprint - aka MD5 checksum - changed) even if the sources used to generate the artifacts did not change. This triggers all the downstream builds instead of just those whose dependencies did change.
I have so far identified these things as varying between builds:
META-INF/maven/.../pom.properties (as it contains a timestamp)
META-INF/MANIFEST.MF (contains JDK and user)
timestamps in jar file
I have found ways around the first two, but the last is a bit more difficult. It appears that AbstractZipArchiver (which does all the work in zipFile() and zipDir()) is not written to allow any kind of extension to how the archive is being generated.
For now I can imagine four approaches (but more ideas are very welcome):
Create a derivative of the current maven-jar-plugin implementation allowing for a timestamp=<number> attribute which is then used for all entries inserted into the jar file. If not set, the current behavior is kept.
Revise the Jenkins fingerprinting scheme so it knows about jar files and only looks at the entries' contents, not their metadata.
Attach a plugin to the prepare-package stage responsible for touching the files with a specific timestamp (see the sketch after this list). This requires all files to be present at that time (meaning that the jar plugin cannot be allowed to touch the MANIFEST.MF file)
Attach an extra plugin to the "package" phase which rewrites the finished jar file, zeroing out all zip entry timestamps in the process.
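For approach #3, a minimal sketch using the maven-antrun-plugin and Ant's touch task could look like the following (the fixed date is arbitrary, and this assumes everything destined for the jar is already under target/classes at prepare-package time):

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-antrun-plugin</artifactId>
  <executions>
    <execution>
      <id>normalize-timestamps</id>
      <phase>prepare-package</phase>
      <goals><goal>run</goal></goals>
      <configuration>
        <target>
          <!-- force every file destined for the jar to a fixed modification time -->
          <touch datetime="01/01/2010 12:00 AM">
            <fileset dir="${project.build.outputDirectory}"/>
          </touch>
        </target>
      </configuration>
    </execution>
  </executions>
</plugin>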
Again, the goal is to make Maven SNAPSHOT artifacts fully time-independent, so that given the same source you get an artifact with the same MD5 checksum. I also believe, however, that this could be beneficial for release builds.
How should I approach this?
As per my comment, I still think the answer is to do none of the things you suggest, and instead use releases in preference to snapshots for artifacts which you are in fact releasing to customers.
The problems you describe are:
you have a multi-module project which takes a long time to build because you have more than 100 modules,
you have two snapshot artifacts which you think ought to be identical (because the source code and metadata were identical at build time), but they have different checksums.
My experience with Maven tells me that if you try and adhere to the "Maven Way", tools will work well for you out-of-the-box, but if you deviate then you'll have a bad time. Unfortunately, the Maven Way is sometimes elusive :-)
Multi-module projects in Maven are very useful when you have families of modules with code that varies in sympathy, e.g. you have a module containing a bunch of interfaces, and some sibling modules providing implementations. It would be unusual to have more than a dozen modules in a multi-module project. All the modules ought to share the version number of the parent (Maven doesn't enforce this, which in my opinion is confusing).
When you build a snapshot version of a multi-module project, snapshots of all modules are built, even if the code in a particular module hasn't changed. Therefore you can look at a family of modules in your repository, and know that at compile time the inter-module code references were satisfied.
For example, in a domain model module you might have an interface:
public interface Student {
void study();
}
and in some sibling modules, which would declare compile-scoped dependencies on the domain model in their POMs, you might have implementations.
If you were then to change the interface in the domain model module:
public interface Student {
void study();
void drink(Beer beer);
}
and rebuild the multi-module project, the build will fail. The dependent modules will fail to build, even though their code and POMs have remained the same. In a multi-module project, you only install or deploy artifacts if all the child modules build successfully, so rebuilding snapshots is usually very desirable - it's telling you something about the inter-module dependencies.
If:
you have an excessive number of modules, and/or
those modules can't reasonably share the same version number, and/or
you don't need any guarantees about code references between modules,
then your modularisation is incorrect. Don't use multi-module projects as a build system (you have Jenkins for that); use them instead to express relationships between the modules of your code.
In your comment, you say:
RELEASE artifacts behave the same way when being rebuilt by Jenkins.
The point of release artifacts is that you do not rebuild them - they are definitive! If you use something like Artifactory, you will find that you cannot deploy a release artifact more than once - your Jenkins job should fail if you attempt it.
This is a fundamental tenet of Maven. One of the aims of Maven is that if two developers on separate workstations were to attempt the same release, they would build artifacts which were functionally identical. If you build an artifact which expresses a dependency on another (maybe for compilation purposes, or because it's being assembled into a .war, etc.), then:
if the dependency is a snapshot, Maven might seek a newer version from the repository.
if the dependency is a release, the version in your local repository is assumed to be definitive.
If you could rebuild a release artifact, you would create the possibility that two developers have dissimilar versions in their repository, and you'd have dissimilar builds depending on which workstation you used. Don't do it.
Another critical detail is that a release artifact cannot depend on snapshot artifacts, again, you would lose various guarantees.
Releases are definitive, and it sounds like you want your assembly to depend on definitive artifacts. Jenkins makes tagging and releasing multi-module projects very straightforward.
In summary:
Check your modularisation: one enormous multi-module project is not useful.
If you don't want to continually rebuild snapshots, you need to do releases.
Never release snapshots to your customer.
Follow the dependency graph of your assembly project and release any snapshots.
Release the assembly project, bumping your minor version.
Ensure your customer refers to the complete version number of your assembly in communications.
I have multiple Java projects in a folder. Also, there is a second folder with libraries that might be used as build dependencies by the projects. The projects may also have dependencies on other projects. What's the best approach to building all the projects?
In other words, I want to build the projects without explicitly specifying their dependencies. I think the biggest problem is the dependency between the projects.
There are multiple build systems available that you may use. Maven has a complete dependency system built into it. Almost all third-party open source jars are directly accessible via the World Wide Maven repository system. Basically, you describe the jar you need (groupId, artifactId, and version) and Maven will automatically fetch it for you. Not only that, but Maven will also build your project without your having to create a build file. Instead, you have to describe your project in a project object model (a pom.xml file) and Maven will download everything you need, including all compilers, etc.
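For example, pulling in a third-party library is just a matter of declaring its coordinates in your pom.xml (version chosen purely for illustration):

<dependencies>
  <!-- Maven resolves and downloads this jar (and its own dependencies) automatically -->
  <dependency>
    <groupId>org.springframework</groupId>
    <artifactId>spring-core</artifactId>
    <version>2.5.6</version>
  </dependency>
</dependencies>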
Almost all new projects use Maven, but Maven has a few downsides:
Since you don't control the build process, it can sometimes feel like poking and prodding a black box to get the build to work the way you want.
Documentation can be scant -- especially if you're moving beyond basic Java compiles.
You usually have to arrange your project in a specific layout. For example, source files should go under src/main/java while JUnit tests go under src/test/java. You don't have to follow the recommended layout, but then you'd have to modify the pom.xml file this way and that to get your build to work, which defeats the whole purpose of the pom.xml in the first place.
If you already have another build system setup (like Ant), you lose everything. There's no easy way to move from Ant to Maven.
The other is called Ant with Ivy. Ivy uses Ant for building, but can access Maven's world wide repository system for third party dependencies. It's a great compromise if you already are heavily invested in Ant. I also find Ant with Ivy to be better documented than Maven (although that's not too difficult). There's an excellent chapter going over the basics of Ivy in Manning Publication's Ant in Action.
With either process, I would recommend that you build a company wide Maven repository using either Nexus or Artifactory. This way, any proprietary third party jars (like Oracle jars) can also be stored in your company wide Maven repository since they won't be in the standard World Wide Maven repository.
By the way, if this is a company wide effort, and you are moving multiple Ant projects into Ivy, I have an Ivy project I use in Github that makes things easier.
Oh, there's a third possibility called Gradle which I know nothing about. I also believe it can use the World Wide Maven repository. It's based on Groovy which is based on Java syntax, and that's about all I can say. Maybe others can fill you in on the details. The Gradle group contends it solves a lot of problems of both Ant/Ivy and Maven.
Whatever tool you use, if you have various interdependent projects, you need to be clear on which independent ones must be built first, before building the dependent projects. You need a clear dependency structure for your projects.
You can do this with Apache Ivy. You can lay out the locations for your common libraries, define published artifacts and inter-dependencies in an ivy.xml document in each project, and let a top-level Ant build with the Ivy tasks figure out what the build order should be based on those dependencies.
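A rough sketch (module and organisation names invented): each project declares itself and its dependencies in ivy.xml, and the top-level build.xml uses Ivy's buildlist task to sort the projects into dependency order before invoking them with subant:

<!-- projectB/ivy.xml: projectB depends on projectA -->
<ivy-module version="2.0">
  <info organisation="com.example" module="projectB"/>
  <dependencies>
    <dependency org="com.example" name="projectA" rev="latest.integration"/>
  </dependencies>
</ivy-module>

<!-- top-level build.xml fragment (assumes the ivy antlib is loaded and each
     project's build.xml has a "build" target) -->
<ivy:buildlist reference="ordered-projects">
  <fileset dir="projects" includes="*/build.xml"/>
</ivy:buildlist>
<subant target="build" buildpathref="ordered-projects"/>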
I hope I can keep this question specific enough; my team at work is currently debating the best way to manage our dependencies for a huge project (150+ dependencies, ~300 MB).
We have two main problems:
Keeping all the developers' dependencies the same, so we are compiling against the same files
Ensuring the project (once compiled) is compiled against the same dependencies
The two ideas that have been suggested are using a BirJar (all dependencies in one file) and just adding a version number to it, and using a shared folder and pointing everyone's machines at the same place.
Or including all the dependencies in the jar when we compile it (a jar of jars of jars) and just having a project that "has no dependencies".
Someone also mentioned setting up an internal version of Ivy and pointing all the code to pull dependencies from there.
What are the best practices regarding massive dependency management?
Why don't you use Maven and its dependency management?
You can specify each dependency, its particular version, and its scope (compile-time, for testing, for deployment, etc.). You can provide a master pom.xml (the config file) that specifies these, and developers can override it if they need to (say, to evaluate new versions).
e.g. I specify a pom.xml that details the particular jars I require and their versions (or ranges). Dependent jars are determined/downloaded automatically. I can nominate which of these jars are used for compilation vs. deployment, etc. If I use a centralised repository such as Nexus, I can then build my artefact (e.g. a library) and deploy it into Nexus, and it'll become available for other developers to download in exactly the same manner as 3rd-party libs, etc.
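For instance, the master POM might pin versions in a dependencyManagement section, so that child projects inherit them without restating version numbers (coordinates are illustrative):

<dependencyManagement>
  <dependencies>
    <!-- children declare groupId/artifactId only and inherit this version -->
    <dependency>
      <groupId>org.springframework</groupId>
      <artifactId>spring-core</artifactId>
      <version>2.5.6</version>
    </dependency>
  </dependencies>
</dependencyManagement>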
In case you don't like/want to follow the Maven project structure...
If you already use Ant, then your best bet is to use Ivy for dependency management.
http://ant.apache.org/ivy/
It provides a rich set of ant tasks for dependency manipulation.
from: Ant dependency management
Quite new to Maven here, so let me explain first what I am trying to do:
We have certain JAR files which will not be added to the repo. This is because they are specific to Oracle ADF and are already placed on our application server. There is only one version to be used by all apps at any one time. In order to compile, though, we need to have these on the classpath. There are a LOT of these JARs, so if we were to upgrade to a newer version of ADF, we would have to go into every application and redefine some pretty redundant dependencies. So again, my goal is to just add these JARs to the classpath, since we will control what version is actually used elsewhere.
So basically, I want to just add every JAR in a given network directory (which devs do not have permission to modify) to Maven's classpath for when it compiles. And without putting any of these JAR files in a repository. And of course, these JARs are not to be packaged into any EAR/WAR.
edit:
Amongst other reasons, I do not want to add these to the corporate repo because:
These JARs are not used by anything else. There are a lot of them, uncommon and exclusive to Oracle.
There will only be one version of a given JAR used at any one time. There will never be a case where Application A depends on 1.0 and Application B depends on 1.1. Both App A and App B will depend solely on either 1.1 or 1.2.
We are planning to maintain 100+ applications. That is a lot of pom.xml files, meaning that any time we upgrade Oracle ADF, if any dependency wasn't correctly specified (via human error), we will have to fix each mistake every time we edit those 100+ pom.xml files for an upgrade.
I see three options:
Put the dependencies in a repository (could be a file repository as described in this answer) and declare them with the provided scope.
Use the dirty system scope trick (i.e. declare the dependencies with a system scope and set the path to the jars in your file system).
A little variation of #2: create a jar with a MANIFEST.MF referencing all the jars (using a relative path) and declare a dependency on this almost-empty jar with a system scope.
The clean way is option #1, but the others would work too in your case. Option #3 seems to be the closest to what you're looking for.
Update: To clarify option #3
Let's say you have a directory with a.jar and b.jar. Create a c.jar with a Class-Path entry in its META-INF/MANIFEST.MF listing other jars, something like this:
Class-Path: ./a.jar ./b.jar
Then declare a dependency in your POM on c (and only on c) with a system scope; the other jars will become "visible" without having to explicitly list them in your POM (sure, you need to declare them in the manifest, but this can be very easily scripted).
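The POM declaration for c might then look like this (groupId, version, and path are placeholders):

<dependency>
  <groupId>com.example</groupId>
  <artifactId>c</artifactId>
  <version>1.0</version>
  <scope>system</scope>
  <!-- manifest-only jar; a.jar and b.jar become visible via its Class-Path entry -->
  <systemPath>${basedir}/lib/c.jar</systemPath>
</dependency>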
Although you explicitly stated you don't want them in the repository, your reasons are not justified. Here's my suggestion:
install these jars in your repository
add them as Maven dependencies, with <scope>provided</scope>. This means that they are provided by your runtime (the application server) and will not be included in your artifacts (war/ear); see the sketch after this list.
Check this similar question
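To illustrate (coordinates invented): each jar is installed once, for example with

mvn install:install-file -Dfile=adf-share.jar -DgroupId=com.oracle.adf -DartifactId=adf-share -Dversion=11.1.1 -Dpackaging=jar

and then declared in the POM like any other dependency:

<dependency>
  <groupId>com.oracle.adf</groupId>
  <artifactId>adf-share</artifactId>
  <version>11.1.1</version>
  <!-- provided by the application server, so not packaged into the war/ear -->
  <scope>provided</scope>
</dependency>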
It is advisable that an organization that's using Maven extensively has its own repository. See, for example, Nexus. Then you can install these jars in your repository and all developers will use them, rather than having the jars in each local repository only.
(The "ugliest" option would be not to use maven at all, put put the jars on a relative location and add them to the classpath of the project, submitting the classpath properties file (depending on the IDE))
If you are developing ADF (10g/11g, I guess) components, I suppose you'll be using JDeveloper as your IDE. JDeveloper comes with a very rich Library Management Tool that allows you to define which libraries are required for compiling and which ones should be packaged for deployment. I suppose you will already know how to add libraries to projects and indicate in the deployment profile which ones should be picked up while packaging. If you want to keep your libraries out of Maven, maybe this could be the best approach. Let's say the libraries you refer to are the "WebCenter" ones; using this approach will guarantee you have the adequate libraries, as JDeveloper will come with the right version of the libraries.
Nevertheless, as you are using Maven, I would not recommend keeping some libraries out of its control and out of the Maven repositories. I'd recommend choosing between Maven and Oracle JDeveloper library management. In our current project we are working with JDeveloper ADF 11g (and WebCenter) and we use Maven; it simply makes library management easier. At the end of the day, we have a big number of third-party libraries (say Apache, Spring, etc.) that are useful to manage with Maven and not so many Oracle libraries really required for compiling in the IDE (as you would only need the API ones and not their implementations). Our approach has been to add the Oracle libraries to our Maven repository whenever they are required and let Maven control the whole dependency management.
As others say in their answers, if you don't want the dependencies to be included in any of your artifacts, use <scope>provided</scope>. Once you configure your development environment, you will be grateful that Maven does the work and you can (almost) forget about dependency management. To build the JDeveloper IDE files we are using the Maven jdev plugin, so mvn jdev:jdev will generate our project files and set up the dependencies on libraries, and among the projects themselves, so everything compiles properly.
Updated:
Of course, you need to refer to the ADF libraries in your POM files. In our project we just refer to the ones used in each application, say the ADF Tag Libraries or a specific service, not the whole ADF/WebCenter stack. For this purpose, use the "provided" scope. You can still let JDeveloper manage your libraries, but we have found that it's simpler to either have a 100% JDeveloper libraries approach or a 100% Maven approach. If you go with the Maven approach it will take you some time to build your local repo at first, but once that's done it's very easy to maintain, and the whole cycle (development, build, test, packaging, and deployment) will be simpler, with a more consistent configuration. It's true that in the future you'll have to update to later ADF versions, but as your repository structure will already be defined, it should be fast. For future upgrades, I'd recommend keeping the ADF version as a property in the top POM; that will allow you to switch to a new version faster.
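For example (the property and artifact names here are placeholders), the top POM could define:

<properties>
  <adf.version>11.1.1.4</adf.version>
</properties>

and every ADF dependency below it would reference that property:

<dependency>
  <groupId>com.oracle.adf</groupId>
  <artifactId>adf-richclient-api</artifactId>
  <version>${adf.version}</version>
  <scope>provided</scope>
</dependency>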