Maven and Refactoring across Multiple Projects

I'm currently trying to find a suitable workflow to execute refactorings across multiple Maven projects, but I cannot find any satisfying solution.
Suppose there are three projects: one called common and two dependent projects called app1 and app2, each in a separate Git repository.
Now suppose that a method in common called StringUtils.trim() is to be renamed to StringUtils.getTrimmed(). Because this method is used by app1 and app2, those projects need to be adapted, too.
Now the question is: what is the correct workflow to roll out this modification to the development team? The situation: a team of about 10 developers, a central SCM server, and a central Maven repository server hosting the artifacts.
Possible workflows:
Just commit the changes to the SCM (for all three projects).
--> Problem: Developers working on app1, for example, probably don't have the common project checked out but are using a binary from the central artifact server. When they fetch the newest version of app1 from the SCM, their build will break. BAD!
In addition to 1, change the dependency of app1 to point to the SNAPSHOT version of common.
--> Problem: When the developers working on app1 fetch the newest version from the SCM, they probably won't get the latest SNAPSHOT of common if the update policy is set to DAILY (which is the default). BAD!
In addition to 2, change the update policy for SNAPSHOTs to ALWAYS (see the snippet below).
--> Problem: Now the developers of app1 will get the latest SNAPSHOT of common and everything is fine, but only if the SNAPSHOT is deployed to the artifact server before I commit the changes to app1! Furthermore, there is a window between the deployment of common to the artifact server and my commit of app1 to the SCM: if the developers fetch the latest SNAPSHOT of common during that window, it will be incompatible with the current version of app1 in the SCM, since I have not committed the changes to app1 yet. BAD!
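For reference, a minimal sketch of what option 3 amounts to in settings.xml; the repository id and URL are placeholders for your own snapshot repository:
<profiles>
  <profile>
    <id>internal-snapshots</id>
    <repositories>
      <repository>
        <id>internal-snapshots</id>
        <url>https://repo.example.com/snapshots</url> <!-- placeholder URL -->
        <snapshots>
          <enabled>true</enabled>
          <updatePolicy>always</updatePolicy> <!-- default is "daily" -->
        </snapshots>
      </repository>
    </repositories>
  </profile>
</profiles>
<activeProfiles>
  <activeProfile>internal-snapshots</activeProfile>
</activeProfiles>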
The only workable solution I can come up with is to skip the SNAPSHOT mechanism and work with release versions. Here is the workflow:
After changing the three projects locally on my machine (before any commit), increase the version of common from say 1.0 to 1.1.
Commit common to the SCM and deploy it to the artifact server.
Only after the deployment is done, change the dependencies in app1 and app2 to point to the new version of common and commit app1 and app2.
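To illustrate, step 3 is nothing more than a dependency bump in app1's (and app2's) pom.xml; the groupId below is a placeholder:
<dependency>
  <groupId>com.example</groupId> <!-- placeholder groupId -->
  <artifactId>common</artifactId>
  <version>1.1</version> <!-- was 1.0 -->
</dependency>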
Problems with this approach: I have to do version management, so this can only be done in coordination with the whole team and considering all dependent projects. Additionally, I have to wait for the common project to be deployed to the artifact server before I can commit my changes to app1 and app2.
Isn't there an easier, more flexible way to carry out such refactorings? I hope someone can help me; maybe there is also some misconception on my side.

I don't think you will find an easy way of doing refactorings of the type you suggest in your example. What you basically have here is a third-party dependency from app1 and app2 to common. And, as you might have noticed, third-party dependencies do not change the names of their interface methods very often, for a good reason: it would be hell for anyone trying to update to the latest version.
I would suggest that you keep the names of your interface methods stable to as high a degree as possible; there should be something terribly wrong with a name for it to change (e.g. the name states that the method does something it doesn't, or vice versa). Wanting to change trim() to getTrimmed() is not a good enough reason to do so.
So, my suggestion would be this:
Do all the refactoring you want in the internal parts of common, but keep the interface as is. If you really have to rename something, make a duplicate of the method in question with the new name, mark the old one as deprecated, and keep both alive for a period of time, until it is reasonable to assume that everyone has started using the new one and stopped using the old.
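A minimal Java sketch of that pattern, using the names from the question (the trimming logic shown is only a stand-in for the real implementation):
public final class StringUtils {

    /**
     * @deprecated use {@link #getTrimmed(String)} instead; will be removed in a future release.
     */
    @Deprecated
    public static String trim(String s) {
        return getTrimmed(s); // old name simply delegates to the new one
    }

    /** New name, same behaviour as the old trim(). */
    public static String getTrimmed(String s) {
        return s == null ? null : s.trim(); // placeholder implementation
    }
}
This way app1 and app2 keep compiling against the old name until their owners migrate at their own pace.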

I would recommend an open-source tool like Sonar (sonarsource.org).

Deprecate the old method names and have them delegate to the new ones. Then you have a time window in which both work. Ask the development teams to check for deprecated methods every so often, and after a fixed time drop the deprecated methods in a future release.

Yes. Use versioning.
For minor version upgrades, keep the old method in place and mark it as deprecated.
Document your changes to the interface in the release notes of the common project.

Related

Best practice for deploying different versions of project for different other projects

We have many Maven projects, and one of these projects is included as a dependency in every other project. The problem is that when deploying a new version of this dependency used by the others, every project gets this new version, which could lead to problems.
Of course, I could manually change the version every time I deploy the project, but that could lead to problems as well, e.g. when forgetting to change the version before deploying.
I also saw the solution of using a ${version} placeholder in the version tag, but that would mean I have to specify the version every time I run a Maven command.
Is there a solution for such problems, where you have a dependency used in many other projects and need a different version in every one of these projects?
The first thing I see:
"The problem is that when deploying a new version of this dependency used by the others, every project gets this new version, which could lead to problems."
This points to a big issue: you seem to be violating the foundational rule of immutable releases.
As a consequence, the version number ends up being useless, because it is always the same and does not convey any information.
First of all, you should follow semantic versioning.
You should also use the maven-release-plugin to increment the version numbers automatically (it will not decide for you whether a change is a minor or a major release, but there are tools to identify such things). This should be driven by a CI/CD setup (Jenkins, etc.).
Tools to check changes for compatibility include RevAPI and JAPI-Checker.
Furthermore, you can handle this via different setups. The mentioned ${version} placeholder is simply wrong and will not work, in several ways.
Upgrading a larger number of projects can be automated with something like Renovate or Dependabot, or with existing Maven plugins that can be run via CI/CD on a schedule (which you should be doing anyway for security scans etc.). In other words, automation is the keyword here.
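As a rough sketch of the kind of plugin-driven automation meant here, the standard versions-maven-plugin offers goals such as the following (the version number is a placeholder):
# set the project's own version as part of a release job
mvn versions:set -DnewVersion=1.4.0
# report available dependency upgrades, or apply them
mvn versions:display-dependency-updates
mvn versions:use-latest-releases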

tools for patch dependency

I have such a problem:
When a customer asks for a new feature of our system, we bundle all the changed files into a patch and send it to a coworker who works with the customer and does the release.
However, not every patch is released in order, so there is a chance that patch A depends on patch B but the patches are released in the wrong order.
Because the coworker is not familiar with programming, he can't figure out the reason, and I have to spend time finding out what's wrong.
As the number of patches waiting for release grows, releasing a patch becomes a nightmare for us.
Is there a tool for this kind of dependency analysis, so that we can see the dependencies between patches and reduce the time spent figuring them out?
Thanks a lot.
You can do this with a dependency management system like Maven, an internal artifact repository for production-ready components (like Nexus), and release branches for hotfixes (if you have to ship updates in near real time).
Using this approach you get:
Testing of any complete release (including integration tests) which you ship to a customer.
Versions with broken dependencies simply will not build, because you can switch to the production-ready repository for the pre-ship build.
You can mark production packages with SCM tags and know exactly what has been shipped to a customer.
You can simply make a diff between shipped and current package.
In a few words:
Separate development and production releases, and you protect yourself from building a production package with broken dependencies.
The correct way to handle this situation is to use a Configuration Management procedure.
A simple one, for example, involves CVS/SVN and a changelog between revision A and revision B; every file in the changelog makes up the patch.
A more complex procedure introduces baselines and intermediate releases.

When to run my own maven repository?

I'm working on a small project with one other developer. We are using libraries that are all available in public Maven repositories. We have a single, multi-module Maven project but build all of the modules every time on our build server and developer machines. The built artifacts are available on the build server for deployment.
Is there any reason for us to setup a maven repository manager? It would run on the build machine, which I would have to connect to via VPN and is slower than downloading from the internet directly.
The best reason to run a repo manager is to achieve reliable behavior in the face of maven's treatment of <repository> declarations.
If any dependency you reference, or plugin you use, or dependency of a plugin you use, declares any repositories, maven will start searching for everything in those repositories. And, if they are unreliable or slow, there goes your build. While current policy is to forbid repository declarations in central, there are plenty of historical poms pointing every-which way.
The only cure for this is to set up a repository manager and declare it to be a 'mirrorOf' '*' in settings.xml. Then, put the necessary rules in your repository manager to search only the places you want to search.
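For reference, the mirror declaration in settings.xml looks roughly like this; the id, name, and URL are placeholders for your own repository manager's public group:
<mirrors>
  <mirror>
    <id>internal-repo</id> <!-- placeholder id -->
    <name>Internal repository manager</name>
    <url>https://repo.example.com/repository/public/</url> <!-- placeholder URL -->
    <mirrorOf>*</mirrorOf> <!-- route every repository request through the manager -->
  </mirror>
</mirrors>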
Of course, if there are more than one of you and you want to share snapshots or releases, you'll really want a repo manager, too. That's the obvious reason.
If I turn off my repo manager and just try to build some of my projects that have the bad luck to have plugin/dependencies with repository declarations in their poms, they fail. With the repo manager, all is well.
If you have to ask, probably not.
But it's a pretty cheap date: Nexus is pretty trivial to set up; I don't know about Archiva.
Public servers tend to be unreliable.
A local mirror will be faster.
You don't want to get hung up because you cleaned your local repo and can't get something off the public one.
If and when you get into needing to work with multiple versions of your app, I'd start thinking about it harder.
In my opinion, it only becomes worthwhile as the projects and the teams working on them keep growing, especially because of the versioning capabilities it gives you.
My answer is that you will have to create your own repository as soon as you need a jar that is not in the public repositories (not so unlikely, since for example some JDBC drivers are not hosted in public repositories for licensing reasons). If you don't, you lose one of the main advantages of Maven: that your colleagues do not have to do anything to get the proper libraries. If you introduce a new library and have a company repository, you don't need to tell your colleagues anything; if you don't, you have to send them the library and tell them how to install it locally.
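For what it's worth, installing such a jar into a local repository by hand is done with something like the following (all coordinates and the file name are placeholders); deploy:deploy-file works the same way against a company repository:
mvn install:install-file -Dfile=vendor-jdbc-driver.jar \
    -DgroupId=com.example.vendor -DartifactId=jdbc-driver \
    -Dversion=1.0 -Dpackaging=jar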

When should mvn release be used in the project lifecycle?

To clarify the question:
I am looking for established best practices or a pro/con analysis of known practices.
By "project lifecycle" I mean: deploy to the pre-integration, integration, QA, preprod, and prod environments.
For some context:
Our project deploys to integration and QA every week. Currently we create a new release for each integration deployment, but this doesn't feel right: it leads to updating all the POMs every week, breaking dev-level dependencies and forcing every dev to refresh their Eclipse configuration. We have large workspaces and Eclipse doesn't handle the refreshes well, so a lot of time is wasted.
I am not overly familiar with the Maven release conventions and have been unable to find guidance on the point in the application lifecycle at which mvn release should be used.
If the pattern we use now is accepted/correct/established I will have another question :)
The approach I use to avoid the Eclipse dev-level dependency update issue is to leave the relevant trunk or branch version number unchanged until the release becomes significant. This way you can have properly tagged/versioned releases to QA etc., so you can trace issues back, without requiring devs to update dependencies. To achieve this I use the following command, overriding the version numbers to get the desired release number while re-entering the current snapshot version as the new snapshot version:
mvn release:prepare -DautoVersionSubmodules=true
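Concretely, the overriding can be done on the command line via the release plugin's releaseVersion and developmentVersion parameters; the version numbers below are placeholders, with the development version deliberately re-entered unchanged:
mvn release:prepare -DautoVersionSubmodules=true \
    -DreleaseVersion=1.0.3 -DdevelopmentVersion=1.0-SNAPSHOT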
Note also the support for early branching (2.1) and late branching (2.2).
In our shop, all of our POMs in SVN have <version>9999-SNAPSHOT</version> (for their own version as well as internal dependencies). This never changes.
During the build, we have a simple ant build.xml that takes the version number (established outside of maven) as a -Dversion=... parameter, and simply does:
<replace includes="**/pom.xml" token="9999-SNAPSHOT" value="${version}"/>
<artifact:mvn ... />
That change is local to the build process's working copy -- it's never checked in to version control.
This way all release builds have a "real" version number, but dev effectively never has to deal with version numbers.
The above is, as you say in your question, emphatically not The Right Way to do this, but it has worked well for us for the ~9 months since we adopted Maven. We have tens of Maven modules, all of which move in lock-step through the QA/release process.
One implication of this approach is that you'll need separate Eclipse workspaces for each branch you're working on, as otherwise the copies of a project from different branches will collide.
[Not really an answer, but the best I have...]
Related to MNG-624.
Depending on how many projects you have, even the burden on your source-control system may be an issue.
Does anyone use an independent numbering scheme with Maven snapshots to avoid version-number churning? In theory, you could do what you'd do without Maven - use an internal numbering system of some kind for the weekly builds. The builds would be deployed to different repositories as dictated by workflow; you'll need separate repositories for dev, QA, maybe one in between for integration test. When you're down to release candidates, start using non-snapshot releases. I'm just evaluating Maven, though - I've no experience with doing this.
Some of the Nexus documentation (for the Professional version) talks about how to do build staging, which may be relevant.
In the past I used a numbering scheme of my own devising: http://wiki.secondlife.com/wiki/Codeticket_Service
I'm now in the situation where I need to think about maven again, and I'm tempted to re-use the codeticket scheme to generate version numbers/build numbers and apply them via the release plugin but without checking the pom files back in. The checked in pom files will keep the SNAPSHOT version numbers.
For those who care about reproducible builds, you can include the modified POM file in your build result. Personally, I care more about tracking the build artifacts and ensuring that the exact same bits that have been tested end up getting released, so my concern about reproducing builds is slightly less religious than most people's.
There is a discussion going on on the Maven users list (in which I'm participating) that seems relevant. Basically we're discussing how to avoid all the POM editing that has to be done whenever you cut a (release or feature) branch. The release plugin can do the editing for you when you create a release branch, but it does not help with feature branches that need to be reintegrated later. Also, all that POM editing causes unnecessary pain when you do merges, either rebase merges from trunk or reintegration merges to trunk.
The idea being discussed there is based on the notion that the proper location to record artifact version numbers is in the SCM tool and not in the POM. Basically, maven should be able to derive the artifact version number from the actual SCM tag or branch that the working area is associated to.
Note that there is no complete solution yet due to some issues still pending in Maven's issue tracker (e.g. MNG-2971), but they are issues with many votes already, and I'm optimistic they will be fixed soon.

Tips for maintaining an internal Maven Repository?

I'm interested in maintaining a Maven 2 repository for my organization. What are the some of the pointers and pitfalls that would help.
What are guidelines for users to follow when setting up standards for downloading from or publishing their own artifacts to the repository when releasing their code? What kinds of governance/rules do you have in place for this type of thing? What do you include about it in your developer's guide/documentation?
UPDATE: We've stood up Nexus and have been very happy with it. We followed most of Sal's guidelines and haven't had any trouble. In addition, we've restricted deploy access and automated build/deployment of snapshot artifacts through a Hudson CI server. Hudson can analyze all of the upstream/downstream project dependencies, so if a compilation problem, test failure, or some other violation causes the build to break, no deployment will occur.
Be wary of doing snapshot deployments in Maven 2/Maven 3, as the metadata has changed between the two versions; the "Hudson only" snapshot deployment strategy will mitigate this. We do not use the release plugin, but have written some plumbing around the versions plugin for moving a snapshot to a release.
We also use m2eclipse, and it seems to work very well with Nexus: from the settings file it can see Nexus and knows to index artifact information for lookup from there (though I have had to tweak some of those settings to have it fully index our internal snapshots). I'd also recommend deploying a source jar with your artifacts as a standard practice if you're interested in doing this; we configure that in a super POM.
I've come across this Sonatype whitepaper which details different stages of adoption/maturity, each with different usage goals for a Maven Repository manager.
I would recommend setting up one Nexus server with at least four repositories. I would not recommend Artifactory. The free version of Nexus is perfectly fine for a dev team of fewer than 20 people in fewer than three groups. If you have more users than that, do yourself a favor and pay for the Sonatype release; the LDAP integration pays for itself.
Internal Release
Internal Snapshot
Internal 3rd Party: for code used in house that comes from outside sources, or for endorsed 3rd-party versions. Put the JDBC drivers, javax.* stuff, and stuff from clients and partners here.
External Proxies: a common proxy for all the usual sources like m2, Codehaus, etc.
Configure Nexus to do the following for internal repos
Delete old Snapshots on regular intervals
Delete Snapshots on release
Build index files. This speeds up local builds, too.
Have a common settings.xml file that uses these four and only these four sources. If you need to customize beyond this try to keep a common part of the settings file and use profiles for the differences. Do not let your clients just roll their own settings or you will end up with code that builds on one machine but not on any other machine.
Provide a common proxy for your clients. In Nexus, you can add a bunch of proxies to the common Maven sources (Apache, JBoss, Codehaus) and have a single proxy exposed to the internal clients. This makes adding and removing sources from your clients much easier.
Don't mix Internal and 3rd party artifacts in the same repository. Nexus allows you to add jars to an internal repository via a web gui. I recommend this as the way of adding your JDBC drivers and other external code to 3rd party. The UI is quite nice to use when compared to most enterprise software.
Define a common parent POM that defines the internal snapshot and release repos via the distributionManagement tag (see the sketch after this list). I know lots of people tell you not to do this, and while I freely admit that there are all kinds of problems with it, it works out OK if the clients will only be building releases and snapshots to be deployed to a single internal repository.
If you have an existing mismanaged Maven repository, create a fifth repository called Legacy and put the whole old repository there. Set up a cron task to delete files from Legacy once they are a year old. That gives everyone a year to move off of it and update their POMs.
Establish an easy-to-follow naming convention for internal artifacts. I prefer a groupId of Department.Function.Project and an artifactId of the component name. For internal repositories, com/org/net and the company name are likely to be irrelevant, and wrong if the company changes its name. It is far less likely that the sales, accounting, or inventory department will be renamed.
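As mentioned in the parent POM point above, a sketch of such a distributionManagement section (repository ids and URLs are placeholders for your Nexus instance):
<distributionManagement>
  <repository>
    <id>internal-releases</id>
    <url>https://nexus.example.com/repository/internal-releases/</url> <!-- placeholder -->
  </repository>
  <snapshotRepository>
    <id>internal-snapshots</id>
    <url>https://nexus.example.com/repository/internal-snapshots/</url> <!-- placeholder -->
  </snapshotRepository>
</distributionManagement>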
Definitely use Nexus. :P
I've used both Nexus and Artifactory. The interface for Nexus is a lot more robust and a lot more configurable, and of course it's written by Sonatype, who represent pretty much everything Maven.
That being said, Artifactory is decent and workable.
A review of Nexus vs. Artifactory:
Oh my! Of course, there's an SO question about the matter.
Sonatype does a feature comparison.
JFrog (maker of Artifactory) does a feature comparison.
Use Artifactory.
I am using Artifactory myself, and love the user interface and ease of deployment/maintenance. That said, I have never used Nexus, and cannot really help you with a proper feature comparison.
Here are some things off the top of my head that I really like about Artifactory (keep in mind Nexus may have these features too):
Nice Web 2.0 interface.
The ability to import your local Maven repository to help get you started.
Ease of integration with existing LDAP servers for security (I'm a big fan of a single repository for storing credentials).
Given that there are really only two major Maven repository managers out there, if you really want to make sure you've made the right choice, I'd recommend trying both out and deciding for yourself which you like better.
Perhaps this is obvious, but for reproducibility, developers should never overwrite artifacts; they should release new versions.
This also applies to upstream repositories: if you download Apache Commons version 1.2.3, you should really never download it again. Fixes come in later versions, not applied to existing versions.
Something else to consider:
http://archiva.apache.org/
As to the ORIGINAL QUESTION (technical issues to consider when setting up an M2 repository), I would recommend creating a read-only user for browsing the repository and an administrative user per administrator (that is, one read-only user shared by all users who are not administrators).
Moreover, I would recommend generating backup images periodically (perhaps once a day). This is very important both if your repository is big and if you install your own artifacts from time to time.
Last but not least, when adding new remote repositories, you should add inclusion/exclusion filters so that artifact lookups in the repository are done more quickly.
There are lots of other issues to consider, but these are the leading issues I've encountered while managing a Maven internal repository.
For the record, I'm using both Nexus and Artifactory. I can clearly state that while Nexus is very simple and functional (though I sometimes have problems with the installation process on Ubuntu), its free version cannot compete with Artifactory's community (free) edition.
Even excluding Artifactory's awesome Web 2.0 UI, its main features, such as security management, periodic backups, and accessibility, go way beyond those of Nexus.
