How to block a maven dependency from getting downloaded - java

I want to block a library so that developers cannot download any old version of it, or block some libraries altogether.
Maven has an Enforcer plugin, but we can't force everyone to use it at the enterprise level.
I am looking for a solution that can block a library even if it is mentioned in the POM, and fail the build if a blocked library is present.
I am exploring multiple solutions, either at the Maven level or by using some scanning tools.
The objective is to do it as early as possible. Artifact scanning is one option, but it happens after the build, when the artifacts are already generated, which is a waste of time.
Any suggestions?

Really, the only way you can control this is by controlling the firewall and the intranet repository. Don't allow direct access to public repositories; only allow the use of a supported intranet repository within the firewall, one that has controlled and monitored access to the public repositories.
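For reference, the Enforcer approach mentioned in the question would look roughly like this when placed in a shared parent POM. This is only a sketch: the plugin version and the banned coordinates below are placeholder examples, not a recommendation.

    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-enforcer-plugin</artifactId>
      <version>3.4.1</version>
      <executions>
        <execution>
          <id>ban-blocked-libraries</id>
          <goals>
            <goal>enforce</goal>
          </goals>
          <configuration>
            <rules>
              <bannedDependencies>
                <excludes>
                  <!-- placeholder examples: ban every version of one library,
                       and all versions of another below a given release -->
                  <exclude>log4j:log4j:*</exclude>
                  <exclude>commons-collections:commons-collections:[,3.2.2)</exclude>
                </excludes>
              </bannedDependencies>
            </rules>
            <!-- fail the build when a banned dependency is found -->
            <fail>true</fail>
          </configuration>
        </execution>
      </executions>
    </plugin>

The catch is exactly the one the question raises: it only helps for builds that inherit that parent, whereas the repository-manager route blocks the artifacts at the source.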

Related

Is there ever a case for internally hosting public artifacts on a private Maven repository?

There are times when a given third-party Java library is not in Maven Central, possibly because it is very old (e.g., it pre-dates Maven) or because, for whatever reason, it has never been uploaded there. One solution is to create a local repo (a local Nexus server) for these assets. However, I am wondering how common this is and whether it is just plain wrong. Is there ever a case where one would want to or should do this?
Yes, this is one common use case for a Maven repository manager such as Nexus (e.g., the Oracle JDBC drivers come to mind).
There are many more reasons why it's considered a best practice to always use a Maven repository manager for any significant use of Maven artifacts.

When to run my own maven repository?

I'm working on a small project with one other developer. We are using libraries that are all available in public Maven repositories. We have a single, multi-module Maven project, but we build all of the modules every time on our build server and on developer machines. The built artifacts are available on the build server for deployment.
Is there any reason for us to setup a maven repository manager? It would run on the build machine, which I would have to connect to via VPN and is slower than downloading from the internet directly.
The best reason to run a repo manager is to achieve reliable behavior in the face of Maven's treatment of <repository> declarations.
If any dependency you reference, or plugin you use, or dependency of a plugin you use, declares any repositories, Maven will start searching for everything in those repositories. And if they are unreliable or slow, there goes your build. While the current policy is to forbid repository declarations in Central, there are plenty of historical POMs pointing every which way.
The only cure for this is to set up a repository manager and declare it to be a mirrorOf '*' in settings.xml. Then, put the necessary rules in your repository manager to search only the places you want to search.
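A minimal sketch of that settings.xml entry, assuming the repository manager exposes a single group URL (the id and URL below are placeholders):

    <settings>
      <mirrors>
        <mirror>
          <id>internal-repo-manager</id>
          <!-- route requests for every repository through the repo manager -->
          <mirrorOf>*</mirrorOf>
          <url>https://repo.example.com/repository/maven-public/</url>
        </mirror>
      </mirrors>
    </settings>

With that in place, any <repository> declarations picked up from third-party POMs are resolved through the manager, where your routing rules decide what is actually searched.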
Of course, if there are more than one of you and you want to share snapshots or releases, you'll really want a repo manager, too. That's the obvious reason.
If I turn off my repo manager and just try to build some of my projects that have the bad luck to have plugins or dependencies with repository declarations in their POMs, they fail. With the repo manager, all is well.
If you have to ask, probably not.
But it's a pretty cheap date: Nexus is pretty trivial to set up; I don't know about Archiva.
Public servers tend to be unreliable.
A local mirror will be faster.
You don't want to get hung up because you cleaned your local repo and can't get something off the public one.
If and when you get into needing to work with multiple versions of your app, I'd start thinking about it harder.
In my opinion, it only becomes worthwhile when the projects and the teams working on them grow more and more, especially because of its versioning capabilities.
My answer is that you will have to create your own repository as soon as you need a JAR that is not in the public repositories (which happens easily enough, since, for example, some JDBC drivers are not hosted in public repos for licensing reasons). If you don't, you will lose the main advantage of Maven: that your colleagues do not have to do anything to get the proper libraries. If you have a company repository and introduce a new library, you do not need to tell your colleagues anything; if you don't, you need to send them the library and explain how to install it locally.

Can I assign a maven dependency to a specific repo?

So I have several large projects that use up to 8 different external repos, all specified in settings.xml rather than in POMs. A lot of our internal dependencies are snapshots, so this obviously causes a lot of checking for updates across several external repos, when they are all in our internal repo.
So my question is: is there a way to set up a profile/filter or anything similar where I can ensure that an update will only be checked for in a specific repo (or repos)?
This is all in the spirit of better/quicker builds.
There is no such feature in Maven proper, though you can achieve the same effect by installing a repository manager and replacing all the references in your settings.xml with a single reference to the repo manager. This also simplifies and centralizes the configuration for the whole team and gives you more control over what you use.
Both Sonatype Nexus and JFrog Artifactory support "artifact routing" using wildcards or regexes. If it's too difficult to propose something like this in your organization, you can install the repo manager on your desktop; they usually do not take up many resources.

Tips for maintaining an internal Maven Repository?

I'm interested in maintaining a Maven 2 repository for my organization. What are some of the pointers and pitfalls that would help?
What are guidelines for users to follow when setting up standards for downloading from or publishing their own artifacts to the repository when releasing their code? What kinds of governance/rules do you have in place for this type of thing? What do you include about it in your developer's guide/documentation?
UPDATE: We've stood up Nexus and have been very happy with it; we followed most of Sal's guidelines and haven't had any trouble. In addition, we've restricted deploy access and automated the build/deployment of snapshot artifacts through a Hudson CI server. Hudson can analyze all of the upstream/downstream project dependencies, so if a compilation problem, test failure, or some other violation causes the build to break, no deployment will occur.

Be wary of doing snapshot deployments across Maven 2 and Maven 3, as the metadata has changed between the two versions. The "Hudson only" snapshot deployment strategy will mitigate this. We do not use the Release plugin, but have written some plumbing around the Versions plugin for moving a snapshot to a release.

We also use m2eclipse, and it seems to work very well with Nexus: from the settings file it can see Nexus and knows to index artifact information for lookup from there. (Though I have had to tweak some of those settings to have it fully index our internal snapshots.) I'd also recommend deploying a source JAR with your artifacts as a standard practice if you're interested in doing this. We configure that in a super POM.
I've come across this Sonatype whitepaper which details different stages of adoption/maturity, each with different usage goals for a Maven Repository manager.
I would recommend setting up one Nexus server with at least four repositories. I would not recommend Artifactory. The free version of Nexus is perfectly fine for a dev team of fewer than 20 people in fewer than three groups. If you have more users than that, do yourself a favor and pay for the Sonatype release. The LDAP integration pays for itself.
Internal Release
Internal Snapshot
Internal 3rd Party: for code used in house that comes from outside sources, or for endorsed 3rd-party versions. Put the JDBC drivers, javax.* artifacts, and code from clients and partners here.
External Proxies: a common proxy for all the usual sources like Maven Central (m2), Codehaus, etc.
Configure Nexus to do the following for the internal repos:
Delete old snapshots at regular intervals
Delete snapshots on release
Build index files (this speeds up local builds too)
Have a common settings.xml file that uses these four and only these four sources. If you need to customize beyond this, try to keep a common part of the settings file and use profiles for the differences. Do not let your clients just roll their own settings, or you will end up with code that builds on one machine but not on any other.
Provide a common proxy for your clients. In Nexus, you can add a bunch of proxies to the common Maven sources (Apache, JBoss, Codehaus) and have a single proxy exposed to the internal clients. This makes adding and removing sources from your clients much easier.
Don't mix internal and 3rd-party artifacts in the same repository. Nexus allows you to add JARs to an internal repository via a web GUI; I recommend this as the way of adding your JDBC drivers and other external code to the 3rd Party repo. The UI is quite nice to use compared to most enterprise software.
Define a common parent POM that declares the internal snapshot and release repos via the distributionManagement tag (a minimal sketch appears after this list). I know lots of people tell you not to do this, and while I freely admit that there are all kinds of problems with doing it, it works out OK if the clients will only be building releases and snapshots to be deployed to a single internal repository.
If you have an existing mismanaged Maven repository, create a fifth repository called Legacy and put the whole old repository there. Set up a cron task to delete files from Legacy once they are a year old. That gives everyone a year to move off of it and update their POMs.
Establish a naming convention for internal artifacts that is easy to stick to. I prefer a groupId of department.function.project and an artifactId of the component name. For internal repositories, com/org/net prefixes and the company name are likely to be irrelevant, and wrong if the company changes its name. It is far less likely that the sales, accounting, or inventory department will be renamed.
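As for the distributionManagement parent POM mentioned above, here is a minimal sketch of what goes inside its <project> element (the ids and URLs are placeholders; the matching credentials live in each developer's settings.xml under the same server ids):

    <distributionManagement>
      <repository>
        <id>internal-releases</id>
        <url>https://nexus.example.com/repository/internal-releases/</url>
      </repository>
      <snapshotRepository>
        <id>internal-snapshots</id>
        <url>https://nexus.example.com/repository/internal-snapshots/</url>
      </snapshotRepository>
    </distributionManagement>

With this inherited, mvn deploy publishes releases and snapshots to the right internal repository without any per-project configuration.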
Definitely use Nexus. :P
I've used both Nexus and Artifactory. The interface for Nexus is a lot more robust and a lot more configurable, and of course it's written by Sonatype, who represent pretty much everything Maven well.
That being said, Artifactory is decent and workable.
A review of Nexus vs. Artifactory
Oh my! Of course, here's an SO question about the matter.
Sonatype does a feature comparison
JFrog (maker of Artifactory) does a feature comparison
Use Artifactory.
I am using Artifactory myself, and love the user interface and ease of deployment/maintenance. That said, I have never used Nexus, and cannot really help you with a proper feature comparison.
Here are some things off the top of my head that I really like about Artifactory (keep in mind Nexus may have these features too):
Nice Web 2.0 interface.
The ability to import your local Maven repository to help get you started.
Ease of integration with existing LDAP servers for security (I'm a big fan of a single repository for storing credentials).
Given that there are really only two major Maven repository implementations out there, if you really want to make sure you've made the right choice, I'd recommend trying both out and deciding for yourself which you like better.
Perhaps this is obvious, but, for reproducibility, developers should never overwrite artifacts; they should publish new versions instead.
This also applies to upstream repositories. If you download Apache Commons version 1.2.3, you should never need to download it again. Fixes come in later versions; they are not applied to existing versions.
Something else to consider:
http://archiva.apache.org/
As for the ORIGINAL QUESTION (technical issues to consider when constructing an M2 repository), I would recommend creating a read-only user for browsing the repository and an administrative user per administrator (that is: one read-only user shared by everyone who is not an administrator).
Moreover, I would recommend generating backup images periodically (once a day, perhaps?). This is very important if your repository is big or if you install your own artifacts from time to time.
Last, but not least, when adding new remote repositories, add inclusion/exclusion filters so that artifact lookups in the repository are resolved more quickly.
There are lots of other issues to consider, but these are the leading issues I've encountered while managing a Maven internal repository.
For the record, I'm using both Nexus and Artifactory; I can clearly state that while Nexus is very simple and functional (though I sometimes have problems with the installation process on Ubuntu), its free version cannot compete with Artifactory's community (free) edition.
Setting aside Artifactory's excellent Web 2.0 UI, its main features, such as security management, periodic backups, and accessibility, go well beyond what Nexus offers.

How do you manage Hibernate's zillion JAR files

At my previous employer I worked with Hibernate, and now that I'm in a small startup I would like to use it again. However, downloading both the Hibernate Core and the Hibernate Annotations distributions is rather painful, as it requires putting a lot of JAR files together. Because the JARs are split up into categories such as "required" and "optional", I would assume that every developer ends up with different contents in their lib folder.
What is the common way to handle this problem? Basically, I want a formal way to get all the JARs for Hibernate, so that (in theory) I would end up with exactly the same set if I needed it again for another project next month.
Edit: I know roughly what Maven does, but I was wondering if there was another way to manage this sort of thing.
As Aaron has already mentioned, Maven is an option.
If you want something a bit more flexible you could use Apache Ant with Ivy.
Ivy is a dependency resolution tool that works in a similar way to Maven: you just define which libraries your project needs, and it will go off and download all the dependencies for you.
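As an illustration, a minimal ivy.xml for the Hibernate case might look like the sketch below; the organisation/module names and the Hibernate version are just example values, so check the repository for the release you actually want:

    <ivy-module version="2.0">
      <info organisation="com.example" module="my-app"/>
      <dependencies>
        <!-- Ivy pulls in hibernate-core plus its required transitive dependencies -->
        <dependency org="org.hibernate" name="hibernate-core" rev="3.6.10.Final"/>
      </dependencies>
    </ivy-module>

Running the Ivy resolve/retrieve tasks from Ant then populates your lib folder consistently on every machine.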
Maybe this is not much of an answer, but I really don't see any problem with Hibernate dependencies. Along with hibernate3.jar, you need to have:
six required JARs, of which commons-collections, dom4j, and slf4j are also commonly used by other open-source projects
one of either the Javassist or CGLIB JARs
depending on your cache and connection-pooling choices, up to two more JARs, which are pretty much Hibernate-specific
So, at the very worst, you will have a maximum of 10 JARs, Hibernate's own JAR included. And of those, only commons-collections, dom4j, and slf4j are likely to be used by some other library in your project. That is hardly a zillion; it can be managed easily and surely does not warrant using an "elephant" like Maven.
I use Maven 2 and have it manage my dependencies for me.
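Concretely, that means a single declaration in the pom.xml, and Maven resolves Hibernate's required JARs transitively (the version shown is just an example):

    <dependencies>
      <dependency>
        <groupId>org.hibernate</groupId>
        <artifactId>hibernate-core</artifactId>
        <version>3.6.10.Final</version>
      </dependency>
    </dependencies>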
One word of caution when considering Maven or Ivy for managing dependencies: the quality of the repository directly affects your build experience. If the repo is unavailable or the metadata for the artifacts (pom.xml or ivy.xml) is incorrect, you might not be able to build. Building your own local repository takes some work but is probably worth the effort. Ivy, for example, has an Ant task that will import artifacts from a Maven repository and publish them to your own Ivy repository. Once you have a local copy of the Maven repo, you can adjust the metadata to fit whatever scheme you see fit to use. Sometimes the latest and greatest release is not in the public repository, which can be an issue.
I assume you use the Hibernate APIs explicitly? Is it an option to use a standard API, let's say JPA, and let a J2EE container manage the implementation for you?
Otherwise, go with Maven or Ivy, depending on your current build system of choice.
