Maven: How to split an existing maven project into different projects?

So I have this big Spring Boot project with hundreds of APIs and corresponding models. I was asked to separate it into three different modules: Storefront, Order Management System (OMS) and Utilities. As per my basic plan, I sorted, filtered and moved the Storefront and OMS APIs to their corresponding projects. I moved all the model classes into Utilities, created a package from it, added this package to the local repository and included it as a dependency of Storefront and OMS. I then exported these two projects as runnable jars, with the required libraries copied into a subfolder next to the generated jar. I did this because the subfolder will include the Utilities package, so if I have to update something in Utilities in the future, I can just replace this package and restart the server.
Everything is working fine; the problem is the size of the final package. The jar of the original project is 175 MB. All three projects have similar pom.xml files, so each of the three projects exports to almost 175 MB. And as I said, I included the Utilities package in the other two projects, so the subfolders for Storefront and OMS each came to around 350 MB.
Finally, my question: is there any way to split a Maven project into 3 different sub-projects which can be built and deployed independently? And is there any way these 3 projects can share a set of libraries which is stored remotely and accessed by each of them independently, so as to decrease the size of the final runnable jar?

I think there are some deeper issues involved. If your artifact is 175 MB and most of that is dependencies, then you have a very large number of dependencies.
So first of all, you should ask yourself whether all of those dependencies are really necessary. It is not unusual for people to add a dependency just to use one simple class from it, and that dependency then brings its own transitive burden. 175 MB really calls for a deeper analysis.
Next, if you find you cannot reduce the dependencies any further, you can split the project into several projects (as you have started to do). But then most of the dependencies should end up in just one of the resulting projects. If all of your resulting projects use all of the dependencies, then these projects are all doing similar, probably overlapping things, which is not good.
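As a starting point for that analysis, `mvn dependency:tree` prints the full transitive tree and `mvn dependency:analyze` reports dependencies that are declared but apparently unused. Once a heavy transitive dependency turns out to be unnecessary, it can be excluded where it is pulled in. A minimal sketch follows; the coordinates `com.example:big-library` and `com.example:unused-heavy-module` are hypothetical placeholders, not artifacts from the question:

```xml
<dependency>
  <groupId>com.example</groupId>
  <artifactId>big-library</artifactId>
  <version>1.0.0</version>
  <exclusions>
    <!-- hypothetical transitive dependency that the project never calls -->
    <exclusion>
      <groupId>com.example</groupId>
      <artifactId>unused-heavy-module</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```

An exclusion is only safe if nothing on the code path actually needs the excluded classes, so it is worth running the full test suite after each one.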

Related

Java project structure: module dependencies in sources vs binaries

I've seen projects where one single project is divided into modules, and each module is itself a Maven project. These modules are integrated through one module which contains references to all the other modules. To launch the project, the user must import all modules into the IDE. So why do people use this approach? Isn't it much easier to package each module as a jar and include the jars as dependencies of some module? Is there any benefit to using projects instead of jars? The drawbacks of using projects are: the user needs to keep all modules in the IDE, may accidentally change source code, and if the IDE starts to compile all those modules, it takes a lot of time.

If you accidentally change one file, only this file gets recompiled, that's not a big deal.
Usually, you need to modularize your project to cut interdependencies and make everything more controllable. Often, the parent project doesn't even have any source files of its own; instead, it is only used to aggregate the modules. Modules are just used here to separate a big project into pieces.
You could develop those pieces as separate projects, but to implement a change in one module and make it available to the other modules that use it, you'd have to build that module and update the dependency in the client module. That's cumbersome. It's much more practical to keep them as one big project where you just change the code you need to change, and it's available to all the modules that depend on it.
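A parent project that only aggregates modules, as described above, could look roughly like this. It is a sketch under assumptions: the group ID, artifact ID and the module directory names `core` and `web` are made up for illustration.

```xml
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>

  <groupId>com.example</groupId>
  <artifactId>app-parent</artifactId>
  <version>1.0.0-SNAPSHOT</version>
  <!-- "pom" packaging: this project has no sources of its own -->
  <packaging>pom</packaging>

  <!-- each entry is a subdirectory containing its own pom.xml -->
  <modules>
    <module>core</module>
    <module>web</module>
  </modules>
</project>
```

Running a build from the parent directory lets Maven work out the build order from the dependencies between the modules, so a change in `core` is picked up by `web` in the same build.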

Publish a bom from a multi-module-project

We are a large company with about 2000 separate Java projects. For historic reasons, we do not have multi-module projects, but we would like to introduce them.
Logically, we already have "groups" of projects, i.e. someone responsible for (say) 50 projects which are closely related. This someone regularly publishes a BOM which contains recent, coherent versions of these 50 projects.
Now it would make a lot of sense to grab these 50 projects and put them into one large multi-module project. Still, it would be necessary to publish a BOM because other projects (outside our group) should have coherent versions.
So, summarised, we need a BOM that contains the versions of all 50 projects that are part of the multi-module project. I wonder what would be the "Maven way" to create such a BOM. What I can think of:
The bom is the 51st project of the multi-module project. The versions of the dependencies are set by properties in the parent pom.
The bom is generated from the information present in the multi-module project and published as side artifact (this probably requires us to write a Maven plugin for this).
What would be advisable?

We are using BOMs as well for our multi-module projects, but we are not tying their generation or update to the build of those modules.
A BOM is only updated when our release management process completes the delivery of a built module (or group of modules): once delivered, the BOM is updated and pushed to Nexus (stored as a 1.0-SNAPSHOT version, constantly overwritten after each delivery).
The BOM is then included in our POMs (for mono- or multi-module projects) and used for dependency management only, meaning our projects depend on artifacts without specifying a version: the dependency management from the BOM provides the latest delivered version of the other modules they depend on.
In other words, we separate the build aspect (done here with Maven) from the release part: the "bills of materials" represent what has been delivered, and ensure all projects are building with versions deemed to work well together (since they have been delivered into production together).
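On the consuming side, that setup might look like the sketch below. The coordinates `com.example.release:delivered-bom` and `com.example.orders:orders-api` are hypothetical; only the 1.0-SNAPSHOT version is taken from the description above.

```xml
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <!-- modelVersion, coordinates etc. omitted for brevity -->

  <dependencyManagement>
    <dependencies>
      <!-- import the BOM: only its dependencyManagement section is used -->
      <dependency>
        <groupId>com.example.release</groupId>
        <artifactId>delivered-bom</artifactId>
        <version>1.0-SNAPSHOT</version>
        <type>pom</type>
        <scope>import</scope>
      </dependency>
    </dependencies>
  </dependencyManagement>

  <dependencies>
    <!-- no version here: it comes from the imported BOM -->
    <dependency>
      <groupId>com.example.orders</groupId>
      <artifactId>orders-api</artifactId>
    </dependency>
  </dependencies>
</project>
```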

I've never seen 2K of commercial Java projects, so will base my answer on how open source works:
Libraries shouldn't be grouped by people; they should be grouped by the problems that they solve. Open source projects often have multiple libs, e.g. Jackson has jackson-databind, jackson-datatype-jsr310, etc. These libs are closely related to each other and may depend on each other.
Such groups shouldn't be too big. Some projects may have 1, others - 5 or 10. 50 libs in a group sounds way too much.
It's easier if libs in a group are released all at the same time (even if only one is updated). This makes it straightforward to keep track of versions in the apps that use multiple libs from a group.
There should be no dependencies between groups! And this is probably the most important rule. Deep hierarchy of libraries that depend on each other is not acceptable because now you need to keep compatibility between many projects and libs. This just doesn't scale. Which means there will be occasional copy-paste code between libs - this is the lesser evil.
There could be some exceptions to the last rule (maybe a lib that is used everywhere) but those must keep backward compatibility of the public API until there are no projects that depend on the old API. Such libs are very hard to maintain and it's better to opensource them.
Standalone projects now can depend on libraries from the same or different groups, but because the version within the group is the same, it's easy to set it as a property just once. Alternatively:
You can look at <scope>import</scope> which allows importing <dependencyManagement> sections from other POM files like parent POMs within a group (for some reason never worked for me).
Or at xxx-all modules - a module that depends on all other modules within group and thus when you depend on it, you also depend on others transitively.
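The xxx-all idea can be sketched as a POM that contains nothing but dependencies on the other modules of the group. All coordinates below (`com.example.json`, `json-all`, `json-core`, `json-datatypes`) are invented for illustration.

```xml
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>

  <groupId>com.example.json</groupId>
  <artifactId>json-all</artifactId>
  <version>2.3.0</version>
  <packaging>pom</packaging>

  <!-- every module of the group, all released with the same version -->
  <dependencies>
    <dependency>
      <groupId>com.example.json</groupId>
      <artifactId>json-core</artifactId>
      <version>2.3.0</version>
    </dependency>
    <dependency>
      <groupId>com.example.json</groupId>
      <artifactId>json-datatypes</artifactId>
      <version>2.3.0</version>
    </dependency>
  </dependencies>
</project>
```

An application would then declare a single dependency on `json-all` with `<type>pom</type>` and receive the listed modules transitively. The trade-off is that it also receives modules it may never use.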

Migrating complex project from Ant to Maven - How to handle unusual folder structures?

In my new project I am confronted with a complex infrastructure with several modules which have grown over the years in an unpleasant, uncontrolled way.
To come to the point: The build process is the horror. There are over 40 different, complex Ant files, which are connected multiple times and the SOA framework also generates several dynamic Ant files. It took a few days to really understand all the dependencies and to finally build the whole project without any errors.
My plan was or is to migrate the whole project from Ant to Maven, since new components are planned and I would like to avoid these problems in the future and well, because it is just horrible the way it is now ;-)
Since I am new to the migration of bigger projects, I am a little bit confused about the best workflow. There are dozens of XML files and scripts involved, which are distributed in a non-Maven directory structure. Overall there are over 3000 files involved. One of the main problems is that I don't know if I really should try to migrate everything in the known Maven directory structure and therefore risk endless editing and refactoring of every single file. Or should I keep the folder structure as it is and bloat my pom.xml files and possibly run into problems with all the different involved plugins? Honestly, both ways don't sound quite constructive.
Does it even make sense to migrate a project in this dimension to Maven? Especially when the SOA framework must use its own Ant files - therefore a combination of Ant and Maven would be necessary. What would be the best strategy to simplify this process?
Thanks for all suggestions.

Here's a simple and quick answer to Mavenizing an Ant project:
DON'T DO IT!
This is not some anti-Maven screed. I use Maven, and I like Maven. It forces developers not to do stupid things. Developers are terrible at writing build scripts. They want to do things this way and not the way everyone else does. Maven makes developers set up their projects in a way that everyone can understand.
The problem is that Ant allows developers to do wild and crazy things that you have to completely redo in Maven. It's more than just the directory structure. Ant allows for multiple build artifacts. Maven only allows for one per pom.xml [1]. What if your Ant project produces a half dozen different jar files, and those jar files contain many of the same classes? You'll have to create a half dozen Maven projects just for the jars, and then another half dozen for the files that are in common between the jars.
I know because I did exactly this. The head of System Architecture decided that Maven is new and good while Ant must be bad and Evil. It didn't matter that the builds worked and were well structured. No, Ant must go, and Maven is the way.
The developers didn't want to do this, so it fell to me, the CM. I spent six months rewriting everything into Maven. We had WSDL, we had Hibernate, we had various frameworks, and somehow, I had to restructure everything to get it to work in Maven. I had to spawn new projects. I had to move directories around. I had to figure out new ways of doing things, all without stopping the developers from doing massive amounts of development.
This was the inner most circle of Hell.
One of the reasons why your Ant projects are so complex probably has to do with dependency management. If you are like our current shop, some developer decided to hack together their own system of dependency management. After seeing this dependency management system, I now know two things developers should never write: their own build files, and dependency management systems.
Fortunately, there is an already existing dependency management system for Ant called Ivy. The nice thing about Ivy is that it works with the current Maven architecture. You can use your site's centralized Maven repository, and Ivy can deploy jars to that repository as Maven artifacts.
I created an Ivy project that automatically set up everything for the developers. It contained the necessary setup and configuration, and a few macros that could replace a few standard Ant tasks. I used svn:externals to attach this Ivy project to the main project.
Adding the project into the current build system wasn't too difficult:
I had to add in a few lines in the build.xml to integrate our ivy.dir project into the current project.
I had to define an ivy.xml file for that project.
I changed any instance of <jar and </jar> to <jar.macro and </jar.macro>. This macro did everything the standard <jar/> task did, but it also embedded the pom.xml in the jar just like Maven builds do. (Ivy has a task for converting the ivy.xml file into a pom.xml).
I ripped out all the old dependency management crap that the other developer added. This could reduce a build.xml file by a hundred lines. I also ripped out all the stuff that did checkouts and commits, or ftp'd or scp'd stuff over. All of this stuff was for their Jenkins build system, but Jenkins can handle this without any help from the build files, thank you.
Add a few lines to integrate Ivy. The easiest way was to delete the jars in the lib directory, and then just download them via ivy.xml. All together, it might take a dozen lines of code to be added or changed in the build.xml to do this.
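For readers who have not seen one, an ivy.xml for such a project could look roughly like the sketch below. The organisation and module names are hypothetical, and the two dependencies are only examples of jars that might previously have sat in the lib directory; this is not the file from the project described here.

```xml
<ivy-module version="2.0">
  <!-- identifies this project; names are placeholders -->
  <info organisation="com.example" module="legacy-app"/>

  <dependencies>
    <!-- resolved from the Maven repository instead of a checked-in lib directory -->
    <dependency org="commons-lang" name="commons-lang" rev="2.6"/>
    <dependency org="junit" name="junit" rev="4.13.2"/>
  </dependencies>
</ivy-module>
```

In the build.xml, Ivy's Ant tasks (for example `ivy:retrieve`) then download these jars before compilation.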
I got to the point where I could integrate Ivy into a project in a few hours, provided the build process itself wasn't too messed up. If I had to rewrite the build.xml from scratch, it might take me two or three days.
Using Ivy cleaned up our Ant build process and allowed us many of the advantages we would have in Maven without having to take a complete restructuring.
By the way, the most helpful tool for this process is Beyond Compare. This allowed me to quickly verify that the new build process was compatible with the old.
Moving onto Maven Anyway...
The funny thing is that once you have integrated your Ant projects with Ivy, turning them into Maven projects isn't that difficult:
Clean up the logic in your build.xml. You might have to rewrite it from scratch, but without most of the dependency management garbage, it's not all that difficult.
Once the build.xml is cleaned up, start moving directories around until they match Maven's structure.
Change source to match the new directory structure. You may have a WAR that contains *css files in a non-standard location, and the code is hardwired to expect these files in that directory. You may have to change your Java code to match the new directory structure.
Break up Ant projects that build multiple projects into separate Ant projects that each builds a single artifact.
Add a pom.xml and delete the build.xml.
[1] Yes, I know this isn't entirely true. There are Maven projects with sub-projects and super POMs. But you will never have a Maven project that builds four different unrelated jars, while this is quite common in Ant.

I have done a similar migration in the past, and I had the same doubts you had; however, I went for the "keep the folder structure intact and specify the paths in the POM files" way and I noticed it wasn't as bad as I thought.
What I actually had to do was set the <sourceDirectory> and the <outputDirectory> appropriately and maybe add some inclusion and exclusion filters. In the end I'd say that even though Maven is very convention-over-configuration-ish and makes your life easier if you follow its directives on where to place files, it doesn't really make things much harder if you don't.
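A sketch of what such a <build> section can look like is shown below. The directory names (`src`, `test`, `build/classes`, `conf`) and the filter patterns are assumptions standing in for whatever the existing Ant layout uses.

```xml
<build>
  <!-- point Maven at the existing Ant-era directories -->
  <sourceDirectory>src</sourceDirectory>
  <testSourceDirectory>test</testSourceDirectory>
  <outputDirectory>build/classes</outputDirectory>

  <resources>
    <resource>
      <directory>conf</directory>
      <!-- inclusion and exclusion filters for non-Java files -->
      <includes>
        <include>**/*.properties</include>
        <include>**/*.xml</include>
      </includes>
      <excludes>
        <exclude>**/*-local.properties</exclude>
      </excludes>
    </resource>
  </resources>
</build>
```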
Besides, something that really helped me when migrating was the possibility to divide the Maven project in modules, which I initially used to replicate the Ant structure (i.e. I had one Maven module for each build.xml file) making the first stage of the migration simpler, and then I changed the module aggregation to make it more meaningful and more Maven-like.
I'm not sure whether this applies fully to your case, since I didn't have any generated Ant files, which I reckon may be the biggest issue for you. But I would definitely follow this road again instead of refactoring and moving files everywhere to Mavenize my project structure.

Is there a dynamic java class level Ivy-like resolver?

This is more a question about what's out there, and future directions about resolving tools such as Ivy. Is there anything that can mention class-level dependencies for packages, rather than package level dependencies?
For example, let's say I have an apache-xyxy package that comes with an ivy.xml listing all its dependencies. But suppose I only use class WX in apache-xyxy, which doesn't require most of those dependencies. Couldn't a resolver be intelligent and identify that class WX can only possibly invoke a certain set of other classes (AB, DC, EF), and that none of those classes use any of the other dependencies, in order to create a minimal subset of required dependencies? This would be easier and safer than cherry-picking package dependencies to remove because the specific classes used don't need them, and it would also avoid breaking several larger packages down into smaller ones just for this reason.
Then, if I later decided to use class GH from apache-xyxy, I could do an ivy resolve, and it would dynamically bring in the additional required libraries.

When packaging compiled Java code for distribution, it's common practice to bundle Java "packages" together. It's also quite possible (but silly) to split a Java package across multiple jars. Large frameworks (like Spring) have lots of sub-packages in different jars so that users can pick and choose what they need at run-time. Of course, the more jar options one has, the more complex it becomes to populate the run-time classpath.
The keyword here is "run-time". Tools like Apache Ivy and Apache Maven are primarily designed to manage dependencies needed at build time.
Apache Maven does have a "runtime" scope for its dependencies, but it's limited to a single list of jars. Typically this scope is used for deciding which jars are needed for testing and populating the lib directory of a WAR file.
Apache ivy has a similar more flexible mechanism called "configurations". It's possible to create as many runtime configurations as you need, and these can be used to decide which jars are downloaded by ivy.
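To make the "configurations" idea concrete, here is a small ivy.xml sketch. The module name and the choice of libraries are illustrative assumptions; the point is only that each dependency is mapped to one of several named configurations.

```xml
<ivy-module version="2.0">
  <info organisation="com.example" module="my-app"/>

  <!-- as many named dependency sets as the project needs -->
  <configurations>
    <conf name="compile" description="needed to compile"/>
    <conf name="runtime" extends="compile" description="needed to run"/>
    <conf name="test" extends="runtime" visibility="private" description="needed for tests"/>
  </configurations>

  <dependencies>
    <!-- "local conf -> conf of the dependency" mapping -->
    <dependency org="org.slf4j" name="slf4j-api" rev="1.7.36" conf="compile->default"/>
    <dependency org="ch.qos.logback" name="logback-classic" rev="1.2.11" conf="runtime->default"/>
    <dependency org="junit" name="junit" rev="4.13.2" conf="test->default"/>
  </dependencies>
</ivy-module>
```

A retrieve restricted to the `runtime` configuration would then download the logging jars but not JUnit.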
So while it would appear Ivy has the answer, I've rarely seen Ivy used when launching programs. (The one exception is Groovy's Grape annotations.)
So what, you might ask, is the answer?
The future of "run-time" classpath management is either OSGi or Project Jigsaw. I'm more familiar with OSGi, where special dependency indicators are added to the jar file's manifest, stating what its dependencies are. The idea is that when a container loads a jar (called a "bundle"), it can check whether the other dependencies are already loaded. These dependencies can be retrieved and loaded from a common repository. This is a fundamentally different way to launch Java. Traditionally each application is loaded onto its own isolated classpath.
Time will tell if either project catches on. In the meantime we use Apache ivy and Apache Maven to build self-contained and possibly over-bloated WAR (EAR, etc) packages.

Share entities between multi-projects

I have 3 Java projects with the same entities.
I want to share entities between these projects because entities can evolve during the development phase.
We are thinking about building a jar with entities and sharing it using Maven (with a repository).
Maybe you have another solution ?

I can also recommend using Maven to share code between projects.
Here are some tips to get started:
Use a Maven Repository Manager such as Nexus. It will help you to create a stable development environment.
Every developer (and also the Continuous Integration Server user) should configure their settings file to use your Maven Repository Manager. Don't specify your repositories in the POMs; configure them only in your Maven Repository Manager. http://www.sonatype.com/books/nexus-book/reference/maven-sect-single-group.html
Use the dependencyManagement and pluginManagement elements of your parent POMs to specify all versions of the plugins and dependencies you are using. Omit these versions in the other POMs (they will inherit them from the parent POM).
I also recommend using different POMs for multi-module builds and parent POMs.
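The settings file mentioned in the tips could be sketched as follows. The mirror ID and the URL are placeholders for your own repository manager and are not taken from the linked book.

```xml
<!-- ~/.m2/settings.xml -->
<settings>
  <mirrors>
    <mirror>
      <id>nexus</id>
      <!-- route every repository request through the repository manager -->
      <mirrorOf>*</mirrorOf>
      <url>https://nexus.example.com/nexus/content/groups/public</url>
    </mirror>
  </mirrors>
</settings>
```

With this in place, the POMs themselves need no <repositories> section, which is the point of the second tip.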

If you want to share common interfaces, classes, functionality or components, Maven is the way to go. In addition to the dependency management, you also get the added bonus of a standard project layout that will simplify things. Easy integration with most common continuous integration servers and a standard release process are further benefits.
Definitely take a look at Maven!

Making your own JAR library is definitely a good solution.
The jar file is easy to distribute via dependency management (Maven, Ivy, Gradle, ...).
The jar is versioned.
The projects using the library can be tested against a specific version. Otherwise you may run into problems if you change entities and forget to change a dependent project. -> integration tests
Regards

Entities are the representation of a given object, am I correct? If so, the default mechanism implemented by Java is object serialization - http://en.wikipedia.org/wiki/Serialization. In the case of jar files, if an entity changes you would have to rebuild the jar each time as well. It may be tedious.

Generate a standard war file in Roo, but then change its packaging to jar.
Then from any standard war file you can just deploy this jar (I'll use the jar as a Maven dependency). I'll maintain uniquely named application context files like pizzaShop-applicationContext.xml and pizzaShop-applicationContext-jpa.xml, so from a parent Spring project I can stack up various Roo projects in this fashion.
I'll also keep their generated webapps folder to allow the generator to work more easily. (This means I have to open up the pom.xml and keep changing it back to jar.) It also helps as cut-and-paste fodder for web.xml entry additions in war files not generated by Roo.
This seems to be a confusing point about Roo: you can just mix and match these jars as you would in any Spring project. They function like self-contained units of springness and work fine sitting side by side with other Spring jars, all under the same webapp/web.xml context.
It's tedious but still better than writing Spring code by hand.
