How to aggregate maven subproject javadoc output without regenerating javadoc - java

I have a largish multimodule Maven build. I need to generate the javadoc for all of the modules and produce an "aggregated" javadoc result that I can deploy to a box for consumption by users.
I did have this working perfectly fine for quite a while, until I tried implementing a custom taglet with specific features and requirements, which makes this more complicated to produce.
All of the submodules inherit a parent pom that is not the aggregator pom. In that parent pom I define the maven-javadoc-plugin. This is what it looked like before I added the custom taglet:
<bottom>Unified Service Layer - bottom</bottom>
<doctitle>Unified Service Layer - title</doctitle>
<footer>Unified Service Layer - footer</footer>
<header>Unified Service Layer - header</header>
<packagesheader>Unified Service Layer - packagesheader</packagesheader>
<top>Unified Server Layer - top</top>
<windowtitle>Unified Service Layer - windowtitle</windowtitle>
With this, I could build all of the modules, each of which generates its own javadoc (which I now know is just a validation step, as aggregate-jar doesn't use this output). I have a separate step, called from Jenkins, that runs "javadoc:aggregate-jar" in the root project and produces the aggregated javadoc jar that I deploy.
Again, this has been working fine until now.
I implemented a custom javadoc taglet which requires getting access to the Class object associated with the source file it is contained within. I got this to work, at least in the individual module builds by adding the following to the configuration above:
In order to have the taglet get access to the class file, I had to add a minimal plugin configuration to each subproject pom.xml, which looks like this:
<tagletArtifacts combine.children="append">
With these minimal changes, I could run the build in each module, generating the javadoc, and examining the generated javadoc output in each module, verifying that it all worked.
However, the problem is, when I run "javadoc:aggregate-jar" in the root project, all of that already built output is ignored. It reruns the javadoc generation for all of the subprojects, also ignoring the appended tagletArtifacts list in each subproject pom.xml file. As a result, I get ClassNotFound errors when it tries to get the class file.
I could "fix" this by putting all of the subproject GAVs into the top-level "tagletArtifacts" list, but I definitely do not want to do that. I liked the ability to specify this in the subproject pom.xml (with combine.children="append") to make it work.
What I need is an overall javadoc package for all of the subprojects, with the taglet able to get access to the class file, without forcing the parent pom to know about all of its subprojects. How can I do this?

I'm facing the same problem with all of the aggregate goals. I checked the source code of maven-javadoc-plugin, and it turns out that the aggregate goals work by traversing the submodules and collecting their source files, and nothing more, so any configuration specified in the submodules is completely ignored.
During execution every submodule is completely ignored:
if ( isAggregator() && !project.isExecutionRoot() ) {
And during collection of source files, submodules are traversed:
if ( isAggregator() && project.isExecutionRoot() ) {
for ( MavenProject subProject : reactorProjects ) {
if ( subProject != project ) {
List<String> sourceRoots = getProjectSourceRoots( subProject );
So at the moment, there is no way to do this.
This is not easy to fix either since the whole plugin works by composing a single call to the actual javadoc tool. If you would like to respect settings in the submodules as well, you'll have to merge the configuration blocks of them. While this would work in your case with tagletArtifacts, it does not work for all the settings you can specify, e.g. any form of filter, and can therefore not be done in a generic way.
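To make the behaviour described above concrete, here is a small stand-alone sketch. It does not use the real plugin or the MavenProject API; the Module class, the method names and the artifact coordinates are all hypothetical. It only models the point of the answer: source roots are gathered from every module, while taglet configuration is taken from the execution root alone, and a fix would have to merge the lists.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class AggregateSketch {

    /** Hypothetical stand-in for a Maven module; not the real MavenProject API. */
    static final class Module {
        final String name;
        final List<String> sourceRoots;
        final List<String> tagletArtifacts;

        Module(String name, List<String> sourceRoots, List<String> tagletArtifacts) {
            this.name = name;
            this.sourceRoots = sourceRoots;
            this.tagletArtifacts = tagletArtifacts;
        }
    }

    /** Models the aggregate goals: source roots are gathered from every module. */
    static List<String> collectSourceRoots(Module root, List<Module> reactor) {
        List<String> roots = new ArrayList<>(root.sourceRoots);
        for (Module sub : reactor) {
            if (sub != root) {
                roots.addAll(sub.sourceRoots);
            }
        }
        return roots;
    }

    /** Models the problem: only the execution root's taglet list is used. */
    static List<String> tagletsUsedByAggregate(Module root) {
        return root.tagletArtifacts;
    }

    /** Hypothetical merge that a patched plugin would need: union of all lists. */
    static Set<String> mergedTaglets(Module root, List<Module> reactor) {
        Set<String> merged = new LinkedHashSet<>(root.tagletArtifacts);
        for (Module sub : reactor) {
            merged.addAll(sub.tagletArtifacts);
        }
        return merged;
    }

    public static void main(String[] args) {
        // Made-up coordinates, for illustration only.
        Module root = new Module("root", List.of(), List.of("com.example:taglet"));
        Module a = new Module("a", List.of("a/src/main/java"), List.of("com.example:a"));
        Module b = new Module("b", List.of("b/src/main/java"), List.of("com.example:b"));
        List<Module> reactor = List.of(root, a, b);

        System.out.println(collectSourceRoots(root, reactor));
        System.out.println(tagletsUsedByAggregate(root));
        System.out.println(mergedTaglets(root, reactor));
    }
}
```

A union like mergedTaglets is plausible for list-valued settings such as tagletArtifacts, but as noted above there is no obvious equivalent for settings such as filters.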


How do I reference my lambda from code in AWS Cloud Development Kit?

Function helloLambda = new Function(helloStack, "hellocdkworld123", FunctionProps.builder()
.code(Code.fromAsset("target/cdkhello-0.1.jar")) // <- x ?
.handler("com.myorg.functions.HelloLambda::sayHello") // <- y ?
It is also possible to reference it via an S3 bucket. But when I run cdk bootstrap I get a generated bucket with a generated name for the jar file. How can I reference that beforehand from code? Of course I could hard-code the exact bucket and file name, but then the purpose of defining it from code is lost, right?
First of all, assuming that the method that you want to execute when the Lambda is invoked is sayHello, from the com.myorg.functions.HelloLambda class, then that part of your solution is correct. The more difficult part is actually accessing the JAR with your Lambda code in it.
NOTE: I've updated my original answer with what I think is a better way to accomplish this. In order to avoid confusion and making this answer too wordy, I've removed the original answer, though much of it is common with this one. I credit this answer for helping to improve this answer.
Pass the path to the dependent resource's JAR to CDK
Create a new property for the full path to your Lambda JAR.
Associate dependency and execution related goals into the package phase of the build.
Update cdk.json to point to the package phase.
Pass the full path via a system property to your CDK code.
Use the System property to pass to Code.asset(...).
I've separated the Lambda and the CDK infrastructure code into separate Maven modules. The intention is that once the Lambda code is compiled and packaged into an uber JAR (its code plus all of its dependencies' code), the infrastructure module can refer to it as a dependency, passing the full path to the Lambda JAR to the App/Stack class so that it can use it as an asset.
Create a new property for the full path to your Lambda JAR.
In the properties section of your pom.xml, create a new property to refer to your Lambda JAR. Something like this:
Populate a property with the full path to your Lambda dependency's JAR, using the dependency plugin.
This associates the properties goal with the process-resources phase. Whenever that phase of the build occurs, the property you've created previously will be populated with the full path to the JAR in your local repository.
Associate dependency and execution related goals into a single phase of the build.
When you create a new CDK Java project, it outputs a file called cdk.json, which points by default to the Maven exec:java goal. In order for your new lambda.jar property to be populated correctly, you need to associate the exec:java goal with the same phase as above.
In order for your code to get access to the JAR file that you've generated, you need to pass a system property (I couldn't get environment variables to work) to your App class. Your pom.xml started with something like this:
Pass the full path via a system property to your CDK code.
In the configuration section (after mainClass), add a system property for your assets directory, something like this:
Update cdk.json to point to the common phase you've used.
The cdk.json of your CDK project should be changed to invoke the phase that both goals are bound to. Once done it will look like this:
"app": "mvn package"
This causes both goals to run in succession, and upon execution the path to your Lambda's JAR is passed as a system property.
Access the property from your App/Stack code.
Finally, now that the system property is created, you can access it from your code by calling System.getProperty("lambda.jar"). Something like this:
final Code code = Code.fromAsset(System.getProperty("lambda.jar"));
You can then use the code reference wherever needed when defining your Lambda functions.
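If the app is started without going through Maven, System.getProperty("lambda.jar") returns null and the failure surfaces later, inside CDK. A small guard can make that easier to diagnose. The following is only a sketch: the class name LambdaJarLocator and its resolve method are hypothetical helpers, not part of CDK, and it uses nothing beyond the JDK.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class LambdaJarLocator {

    /**
     * Reads the given system property and checks that it names an existing file.
     * Hypothetical helper; the property name is whatever was configured in pom.xml.
     */
    static Path resolve(String propertyName) {
        String value = System.getProperty(propertyName);
        if (value == null || value.isEmpty()) {
            throw new IllegalStateException(
                    "System property '" + propertyName + "' is not set");
        }
        Path jar = Paths.get(value).toAbsolutePath();
        if (!Files.isRegularFile(jar)) {
            throw new IllegalStateException("Lambda JAR not found at " + jar);
        }
        return jar;
    }
}
```

In the App/Stack code the result could then be handed over as Code.fromAsset(LambdaJarLocator.resolve("lambda.jar").toString()).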

Import path in Java, Maven

Following the tutorial about Kafka Streams located at:
There is a line:
import io.confluent.examples.streams.avro.WikiFeed
As I suppose it relates to this file:
How does Maven know it is in the resources folder and not the java folder?
Why io/confluent/examples/streams/avro/wikifeed.avsc instead of avro/io/confluent/examples/streams/wikifeed.avsc?
The other import is even more fantastic:
import io.confluent.kafka.serializers.AbstractKafkaAvroSerDeConfig;
There is no kafka folder in the java/io/confluent folder.
How is all this magic supposed to work?
The magic is made by avro-maven-plugin which you can find in the pom.xml:
Quoting from the documentation of the plugin:
Simple integration with dynamic languages. Code generation is not required to read or write data files nor to use or implement RPC protocols. Code generation as an optional optimization, only worth implementing for statically typed languages.
That is, before compilation, the plugin reads the content of the .avsc files and generates sources (in this case, Java classes) that can then be used in the code.
You can see the code generated by the plugin in target/generated-sources. There will be a folder structure and proper java (not class) files there.
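The package of a generated class comes from the namespace declared inside the schema, not from the location of the .avsc file, which is why the import does not mirror the resource folder layout. As a rough illustration, the sketch below maps a namespace and record name to the relative path of the generated source file. The namespace value is assumed from the import statement in the question, and the class and method names are made up for this example.

```java
public class GeneratedSourcePath {

    /** Maps a schema namespace and record name to the generated file's relative path. */
    static String relativePath(String namespace, String recordName) {
        return namespace.replace('.', '/') + "/" + recordName + ".java";
    }

    public static void main(String[] args) {
        // Namespace assumed from "import io.confluent.examples.streams.avro.WikiFeed".
        System.out.println(relativePath("io.confluent.examples.streams.avro", "WikiFeed"));
        // prints io/confluent/examples/streams/avro/WikiFeed.java
    }
}
```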
The WikiFeed class is created dynamically at build time using the avro-maven-plugin from the .avsc file you linked to. You can check how it's configured in the <plugins> section of pom.xml.
The AbstractKafkaAvroSerDeConfig class comes from the kafka-avro-serializer dependency. Eclipse has a nice way of navigating from an individual class in the Editor view back to the Package Explorer, which includes the Maven dependencies.

jacoco only shows coverage for classes in the same module

I have a somewhat large multi-module Maven project. I have the unit tests in each module being processed by Jacoco. I have a separate child module doing "merge" and "report-aggregate", and this appears to be generating data. I'm even using the generated data in SonarQube. Most of my tests are using PowerMock, and I'm using offline instrumentation.
However, after looking closer at the coverage data, I see that it is leaving out coverage data for classes and methods that I know are being executed during tests. The pattern I see in every module is that it only reports coverage for a single class in each module, which is a class actually in the current module. Almost all of the tests also call out to other classes in other modules in the build, and coverage for those classes is never reported.
The following plugin configurations are in the parent pom used by each child module:
When I inspect the generated HTML results for each module, I find that it only reports results for the single class in the current module, and not the data for classes in other modules. From this, I would assume that how I do "merge" and "report-aggregate" in the separate child module is probably irrelevant to this problem.
The generated "jacoco.exec" file is binary, but I tried "catting" out one from one module just to see what ascii text was visible, and it showed only one occurrence of anything that looked like a file name, and it was the only file name reported in the HTML coverage report for that module.
I'm not sure what other information I can report.
I guess I can see pretty clearly now that when surefire runs unit tests, it uses the instrumented classes from the current module, but the uninstrumented classes from the maven artifacts. This is why I only see coverage for classes in the current module.
So it seems like I need a way to specify that the "target/generated-classes/jacoco" folder for each module the current module depends on, is prepended to the classpath that surefire uses. I don't see a way to do that.
Alternatively, I see that the "instrument" goal has an "includes" configuration element. Should I be specifying paths to all of the "target/classes" directories for each of the modules that the current module depends on?
Recording coverage for a class requires that class to be instrumented. The instrument goal instruments only the classes of the current module.
all of the tests also call out to other classes in other modules
Those are the classes that are not instrumented and, if I understood correctly, exactly the ones for which you are missing coverage.
If you don't use PowerMock for classes that come from other modules, but only for classes in the current module, then you can combine offline instrumentation with on-the-fly instrumentation using the agent. But in this case make sure that classes instrumented offline are explicitly excluded from instrumentation by the agent, otherwise the agent will throw IllegalStateException: Class ... is already instrumented.
If you use PowerMock for classes that come from other modules, then this becomes more complex because Maven is strict about manipulating classpaths and dependencies. I doubt that this can be easily achieved using one mvn command, but it seems possible using more:
instrument and run tests, but don't use restore-instrumented-classes
restore classes and generate report(s)
Unfortunately you haven't provided a complete example, and I don't have time to prepare a full example to test this approach right now.
As a side note: the inability to simply use the agent comes from the fact that PowerMock bypasses any agent and reads class files from disk.

Post-process jar after assembly but before installation (to get idempotent builds)

We use Jenkins, which uses md5 fingerprinting to identify artifacts and to tell whether an artifact has changed since the last build. Unfortunately, Maven builds always generate binary-different artifacts.
Therefore I am looking into making Maven generate the same jar artifact for the same set of input files regardless of where and when they were built, which among other things means that the entries in the jar file must be sorted, not only in the index but also in the order they are written to the jar file.
After examining maven-jar-plugin, which uses maven-assembly-plugin, my conclusion is that they do not collect all files to be written in memory before writing them all at once, but write one at a time. This means that it may be better to post-process the generated jar instead of changing the current behavior, so that at that point I can sort the entries, zero the timestamps, etc.
I am unfamiliar with writing Maven plugins, so my question is: how should I write a plugin so that Maven can tell it where the in-progress artifact jar is located, and how do I hook it up in my pom.xml?
(At first I need this to work for jar files, but war files would be nice too).
As mentioned, this can be done based on something similar to maven-shade-plugin. I went ahead and wrote a simple plugin to add this capability -- see (available on the Central repo).
The behavior is based on the shade plugin. It consists of a single goal called normalize which can be bound to the package lifecycle phase and configured in the project's POM:
A few notes about the plugin:
The artifact under build is accessed via project#getArtifact() where project is a org.apache.maven.project.MavenProject.
Normalization consists of mainly three steps:
Setting the last modified time of all Jar entries to a specific timestamp (default value is 1970-01-01 00:00:00 but can be changed via the -Dtimestamp system property).
Reordering (alphabetically) of attributes in the manifest except for Manifest-Version which always comes first.
Removing comments from the file which contain a timestamp that causes the Jar to differ from one build to another.
Once invoked, the goal will generate the output file next to the original artifact (named artifactId-version-normalized.jar), i.e. in the directory.
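For readers who want to see what the first normalization step amounts to, here is a minimal sketch using only java.util.zip. It is not the plugin's code: the class name JarNormalizer and its normalize method are made up for illustration, it holds every entry in memory, and it leaves the manifest contents untouched. Note also that ZipEntry.setTime stores a local-time DOS timestamp, so the output may still differ between machines in different time zones.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Enumeration;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

public class JarNormalizer {

    /**
     * Copies a jar so that entries are written in name order with one fixed
     * timestamp. Hypothetical sketch; not the plugin's implementation.
     */
    static void normalize(Path in, Path out, long fixedMillis) throws IOException {
        // A TreeMap keeps the entries sorted by name.
        Map<String, byte[]> entries = new TreeMap<>();
        try (ZipFile zip = new ZipFile(in.toFile())) {
            Enumeration<? extends ZipEntry> all = zip.entries();
            while (all.hasMoreElements()) {
                ZipEntry entry = all.nextElement();
                try (InputStream is = zip.getInputStream(entry)) {
                    entries.put(entry.getName(), is.readAllBytes());
                }
            }
        }
        try (ZipOutputStream zos = new ZipOutputStream(Files.newOutputStream(out))) {
            for (Map.Entry<String, byte[]> e : entries.entrySet()) {
                // A fresh ZipEntry drops the original timestamp and extra fields.
                ZipEntry entry = new ZipEntry(e.getKey());
                entry.setTime(fixedMillis);
                zos.putNextEntry(entry);
                zos.write(e.getValue());
                zos.closeEntry();
            }
        }
    }
}
```

Two jars with the same entries written in a different order or at different times should come out byte-identical after this step when it is run in the same environment.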
To create maven plugin project
mvn archetype:generate \
-DgroupId=sample.plugin \
-DartifactId=hello-maven-plugin \
-DarchetypeGroupId=org.apache.maven.archetypes \
Invoking this command generates a skeleton project with a class called
Write your logic inside the execute() method, and install the plugin into your repository with mvn clean install.
Then attach its execution to your project, in your project's pom.xml.
To access project properties inside your Mojo:
/**
 * The Maven project.
 * @parameter expression="${project}"
 * @required
 * @readonly
 */
private MavenProject project;
and then
and read its other properties to determine where your packaged jar file is located.
maven: guide-java-plugin-development
I agree that creating a custom Maven plugin seems like the better option. I don't know of an existing plugin that provides what you asked for.
The md5 checksum (or sha-1 in my repository) is generated by the install plugin, so it seems you need to extend that plugin or write a new one that runs after the install phase.
I have 2 suggestions about this plugin:
1) Keeping it simple, this plugin should:
Read generated jar:
Extract all entries.
Exclude some entries (e.g. MANIFEST.MF).
Sort remaining entries .
Extract md5s for each in memory.
Generate a single md5 from all of those extracted.
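The steps listed above can be sketched with the JDK alone. This is only an illustration of the idea, not an existing plugin: the class name JarContentDigest is made up, entries are read fully into memory, and only META-INF/MANIFEST.MF and directory entries are excluded.

```java
import java.io.IOException;
import java.io.InputStream;
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

public class JarContentDigest {

    /** Returns one MD5 hex string computed over the sorted entry names and contents. */
    static String digest(Path jar) throws IOException, NoSuchAlgorithmException {
        // TreeMap sorts by entry name, so the original write order does not matter.
        Map<String, byte[]> entryDigests = new TreeMap<>();
        try (ZipFile zip = new ZipFile(jar.toFile())) {
            for (ZipEntry entry : Collections.list(zip.entries())) {
                if (entry.isDirectory() || entry.getName().equals("META-INF/MANIFEST.MF")) {
                    continue; // excluded from the fingerprint
                }
                try (InputStream in = zip.getInputStream(entry)) {
                    entryDigests.put(entry.getName(),
                            MessageDigest.getInstance("MD5").digest(in.readAllBytes()));
                }
            }
        }
        // Combine the per-entry digests, in name order, into a single digest.
        MessageDigest overall = MessageDigest.getInstance("MD5");
        for (Map.Entry<String, byte[]> e : entryDigests.entrySet()) {
            overall.update(e.getKey().getBytes(StandardCharsets.UTF_8));
            overall.update(e.getValue());
        }
        return String.format("%032x", new BigInteger(1, overall.digest()));
    }
}
```

Because entry timestamps and ordering never enter the digest, two builds of the same classes yield the same value, as long as the compiled bytes themselves are identical.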
However, when considering independence from where and when the build runs: according to the .class file structure (Java_class_file), minor and major version entries are stored in compiled class files. So if the compiler changes, the .class files will change. In that case we would need a check at the source code level instead :( So this solution becomes useless if there is no guarantee about the compiler version.
2) As a very dirty but easy solution, this plugin could extract only the md5 of your module's pom.xml file. But you must then manually guarantee that each change in your jar is reflected in a minor version (or build number).
Instead of writing your own plugin you can write a Groovy script that is executed by groovy-maven-plugin:
import java.util.jar.*
String fileName = '${}/${}.jar'
println "Editing file ${fileName}"
JarFile file = new JarFile(fileName);
// do your edit

Localization in a GWT multi-module project

I have a GWT maven webapp project that used to consist of a single module. As a result of requirements evolution, I need to extract some of the code into separate modules to make them reusable. So far, this process was going well until I decided to extract localization code in order to use it in another project.
What I have is MyAppConstants and MyAppMessages interfaces with corresponding .properties files, which are used in client code by means of GWT.create(). I moved them to separate module, added Localization.gwt.xml file and specified the following inside pom.xml:
<!-- Do not compile source files, just check them -->
<!-- i18n -->
In the main application module I simply inherited Localization.gwt.xml. As a result of compilation, I can see that the .cache.html files do not contain the localized constants and messages (they look like \u0410\u043B...) which they used to have. I suppose this happens because the GWT compiler doesn't see the source files (for example, in the .generated folder where they normally reside after successful execution of the i18n phase of the Maven plugin). Instead, they can be found in localization.jar.
I feel like I'm missing something, because this seems like it should be a simple task. What would be the proper way of handling such a scenario?
It turns out, in order to have proper localization, you need to have .properties files in classpath at the time of GWT compilation. Initially, I filtered them out of localization.jar because their presence caused GWT compilation failures with messages like this:
Rebind result 'com.myapp.client.MyAppConstants_ru' must be a class
I dug into the gwt-dev.jar contents and found out that the compiler actually checks for the presence of localization properties files in the classpath to determine bind results.
So my problem was solved by:
removing <goal>i18n</goal> and corresponding configuration in localization module
making sure .properties files make their way to localization.jar
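One quick way to confirm that the .properties files really are visible is to look them up as classpath resources from a small Java program run with the same classpath as the GWT compilation. The sketch below is only a diagnostic aid with made-up class and method names; the interface name is taken from the error message above, and the locale "ru" is an assumption based on it.

```java
public class LocalizationResourceCheck {

    /** Builds the classpath location of a locale-specific .properties file. */
    static String propertiesPath(String interfaceName, String locale) {
        return interfaceName.replace('.', '/') + "_" + locale + ".properties";
    }

    /** Returns true if the resource can be found on the application classpath. */
    static boolean isOnClasspath(String resourcePath) {
        return ClassLoader.getSystemResource(resourcePath) != null;
    }

    public static void main(String[] args) {
        // Interface name and locale assumed from the error message in the answer.
        String path = propertiesPath("com.myapp.client.MyAppConstants", "ru");
        System.out.println(path + " visible: " + isOnClasspath(path));
    }
}
```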
Which makes me wonder, what's the use of i18n goal of gwt-maven-plugin?

