I'm trying to use Docker to automate Maven builds. The project I want to build takes nearly 20 minutes to download all of its dependencies, so I tried to build a Docker image that would cache these dependencies, but it doesn't seem to save them. My Dockerfile is:
FROM maven:alpine
RUN mkdir -p /usr/src/app
WORKDIR /usr/src/app
ADD pom.xml /usr/src/app
RUN mvn dependency:go-offline
The image builds, and it does download everything. However, the resulting image is the same size as the base maven:alpine image, so it doesn't seem to have cached the dependencies in the image. When I try to use the image to run mvn compile, it goes through the full 20 minutes of re-downloading everything.
Is it possible to build a Maven image that caches my dependencies so they don't have to be downloaded every time I use the image to perform a build?
I'm running the following commands:
docker build -t my-maven .
docker run -it --rm --name my-maven-project -v "$PWD":/usr/src/mymaven -w /usr/src/mymaven my-maven mvn compile
My understanding is that whatever RUN does during the docker build process becomes part of the resulting image.
Usually there's no change in the pom.xml file, just some other source code changes, when you start a Docker image build. In that case you can do this:
FYI:
FROM maven:3-jdk-8
ENV HOME=/home/usr/app
RUN mkdir -p $HOME
WORKDIR $HOME
# 1. add pom.xml only here
ADD pom.xml $HOME
# 2. start downloading dependencies
RUN ["/usr/local/bin/mvn-entrypoint.sh", "mvn", "verify", "clean", "--fail-never"]
# 3. add all source code and start compiling
ADD . $HOME
RUN ["mvn", "package"]
EXPOSE 8005
CMD ["java", "-jar", "./target/dist.jar"]
So the key is:
Add the pom.xml file only.
Then run mvn verify --fail-never, which downloads the Maven dependencies.
Then add all your source files and start the compilation (mvn package).
When there are changes in your pom.xml file, or you are building for the first time, Docker will do 1 -> 2 -> 3. When there are no changes in pom.xml, Docker will skip steps 1 and 2 and go straight to 3.
This simple trick can be used with many other package managers (Gradle, Yarn, npm, pip).
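For example, a hypothetical npm version of the same trick (image tag and build script names are assumptions) looks like this:
FROM node:lts-alpine
WORKDIR /app
# 1. copy only the dependency manifests
COPY package.json package-lock.json ./
# 2. this layer stays cached until the lockfile changes
RUN npm ci
# 3. copy the sources and build
COPY . .
RUN npm run build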
Edit:
You should also consider using mvn dependency:resolve or mvn dependency:go-offline accordingly as other comments & answers suggest.
Using BuildKit
From Docker v18.03 onwards you can use BuildKit instead of volumes that were mentioned in the other answers. It allows mounting caches that can persist between builds and you can avoid downloading contents of the corresponding .m2/repository every time.
Assuming that the Dockerfile is in the root of your project:
# syntax = docker/dockerfile:1.0-experimental
FROM maven:3.6.0-jdk-11-slim AS build
COPY . /home/build
RUN mkdir /home/.m2
WORKDIR /home/.m2
USER root
RUN --mount=type=cache,target=/root/.m2 mvn -f /home/build/pom.xml clean compile
target=/root/.m2 mounts the cache at /root/.m2, the local repository location used by the Maven image (see the Maven image's Dockerfile docs).
For building you can run the following command:
DOCKER_BUILDKIT=1 docker build --rm --no-cache .
More info on BuildKit can be found here.
It turns out the image I'm using as a base has a parent image which defines
VOLUME "$USER_HOME_DIR/.m2"
see: https://github.com/carlossg/docker-maven/blob/322d0dff5d0531ccaf47bf49338cb3e294fd66c8/jdk-8/Dockerfile
The result is that during the build, all the files are written to $USER_HOME_DIR/.m2, but because it is expected to be a volume, none of those files are persisted with the container image.
Currently in Docker there isn't any way to unregister that volume definition, so it would be necessary to build a separate maven image, rather than use the official maven image.
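A minimal sketch of such a separate image (the base image and package choice are assumptions) installs Maven on top of a plain JDK base, so /root/.m2 is an ordinary directory instead of a declared volume:
FROM openjdk:8-jdk
# install Maven from the distribution packages; /root/.m2 is NOT declared as a volume here
RUN apt-get update && apt-get install -y --no-install-recommends maven \
    && rm -rf /var/lib/apt/lists/*
WORKDIR /usr/src/app
COPY pom.xml .
# the downloaded repository is baked into this image layer
RUN mvn -B dependency:go-offline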
I don't think the other answers here are optimal. For example, the mvn verify answer executes the following phases, and does a lot more than just resolving dependencies:
validate - validate the project is correct and all necessary information is available
compile - compile the source code of the project
test - test the compiled source code using a suitable unit testing framework. These tests should not require the code be packaged or deployed
package - take the compiled code and package it in its distributable format, such as a JAR.
verify - run any checks on results of integration tests to ensure quality criteria are met
All of these phases and their associated goals don't need to be run if you only want to resolve dependencies.
If you only want to resolve dependencies, you can use the dependency:go-offline goal:
FROM maven:3-jdk-12
WORKDIR /tmp/example/
COPY pom.xml .
RUN mvn dependency:go-offline
COPY src/ src/
RUN mvn package
There are two ways to cache maven dependencies:
Execute "mvn verify" as part of a container execution, NOT build, and make sure you mount .m2 from a volume.
This is efficient but it does not play well with cloud build and multiple build slaves
Use a "dependencies cache container", and update it periodically. Here is how:
a. Create a Dockerfile that copies the pom and resolves the dependencies offline:
FROM maven:3.5.3-jdk-8-alpine
WORKDIR /build
COPY pom.xml .
RUN mvn dependency:go-offline
b. Build it periodically (e.g. nightly) as "deps:latest" (Docker repository names must be lowercase).
c. Create another Dockerfile to actually build the system per commit (preferably using multi-stage builds) - and make sure it is FROM deps (see the sketch below).
Using this system you will have fast, reconstruct-able builds with a mostly good-enough cache.
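For step c, a hypothetical per-commit Dockerfile could look like the following, assuming the cache image from step a was tagged deps:latest (its WORKDIR /build and pre-fetched local repository are inherited) and an openjdk:8-jre-alpine runtime stage:
FROM deps:latest AS build
# only the sources change between commits; the dependencies are already in the local repository
COPY src/ src/
RUN mvn -B package

FROM openjdk:8-jre-alpine
COPY --from=build /build/target/*.jar /app.jar
ENTRYPOINT ["java", "-jar", "/app.jar"]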
@Kim's answer is closest, but it's not quite there yet. I don't think adding --fail-never is correct, even though it gets the job done.
The verify command causes a lot of plugins to execute, which is a problem (for me) - I don't think they should be executing when all I want is to install dependencies! I also have a multi-module build and a JavaScript sub-build, which further complicates the setup.
But running only verify is not enough either, because a later install run uses additional plugins - which means more dependencies to download - and Maven refuses to fetch them ahead of time otherwise. Relevant read: Maven: Introduction to the Build Lifecycle
You basically have to find what properties disable each plugin and add them one-by-one, so they don't break your build.
WORKDIR /srv
# cache Maven dependencies
ADD cli/pom.xml /srv/cli/
ADD core/pom.xml /srv/core/
ADD parent/pom.xml /srv/parent/
ADD rest-api/pom.xml /srv/rest-api/
ADD web-admin/pom.xml /srv/web-admin/
ADD pom.xml /srv/
RUN mvn -B clean install -DskipTests -Dcheckstyle.skip -Dasciidoctor.skip -Djacoco.skip -Dmaven.gitcommitid.skip -Dspring-boot.repackage.skip -Dmaven.exec.skip=true -Dmaven.install.skip -Dmaven.resources.skip
# cache YARN dependencies
ADD ./web-admin/package.json ./web-admin/yarn.lock /srv/web-admin/
RUN yarn --non-interactive --frozen-lockfile --no-progress --cwd /srv/web-admin install
# build the project
ADD . /srv
RUN mvn -B clean install
But some plugins are not that easily skipped - I'm not a Maven expert (so I don't know why it ignores the CLI option - it might be a bug), but the following works as expected for org.codehaus.mojo:exec-maven-plugin:
<project>
  <properties>
    <maven.exec.skip>false</maven.exec.skip>
  </properties>
  <build>
    <plugins>
      <plugin>
        <groupId>org.codehaus.mojo</groupId>
        <artifactId>exec-maven-plugin</artifactId>
        <version>1.3.2</version>
        <executions>
          <execution>
            <id>yarn install</id>
            <goals>
              <goal>exec</goal>
            </goals>
            <phase>initialize</phase>
            <configuration>
              <executable>yarn</executable>
              <arguments>
                <argument>install</argument>
              </arguments>
              <skip>${maven.exec.skip}</skip>
            </configuration>
          </execution>
          <execution>
            <id>yarn run build</id>
            <goals>
              <goal>exec</goal>
            </goals>
            <phase>compile</phase>
            <configuration>
              <executable>yarn</executable>
              <arguments>
                <argument>run</argument>
                <argument>build</argument>
              </arguments>
              <skip>${maven.exec.skip}</skip>
            </configuration>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</project>
Please notice the explicit <skip>${maven.exec.skip}</skip> - other plugins pick this up from the CLI parameters, but not this one (neither -Dmaven.exec.skip=true nor -Dexec.skip=true works by itself).
Hope this helps
Similar to @Kim's answer, but I use the dependency:resolve mvn goal. So here's my complete Dockerfile:
FROM maven:3.5.0-jdk-8-alpine
WORKDIR /usr/src/app
# First copy only the pom file. This is the file that changes least often
COPY ./pom.xml .
# Download the dependencies so they are cached in a Docker image layer
RUN mvn -B -f ./pom.xml -s /usr/share/maven/ref/settings-docker.xml dependency:resolve
# Copy the actual code
COPY ./ .
# Then build the code
RUN mvn -B -f ./pom.xml -s /usr/share/maven/ref/settings-docker.xml package
# The rest is same as usual
EXPOSE 8888
CMD ["java", "-jar", "./target/YOUR-APP.jar"]
After a few days of struggling, I managed to get this caching working using an intermediate container, and I'd like to summarize my findings here, as this topic is useful and frequently shows up on the Google search front page:
Kim's answer only works under a certain condition: pom.xml cannot change, and by default Maven checks for updates on a daily basis.
mvn dependency:go-offline -B --fail-never has a similar drawback, so if you need to pull fresh code from the repo, chances are high that Maven will trigger a full re-download every time.
Mounting a volume doesn't work either, because we need the dependencies resolved while the image is being built.
Finally, I have a combined solution that works for me (it may not work for others):
Build an image that resolves all the dependencies first (not an intermediate image).
Create another Dockerfile that uses that image as an intermediate stage; sample Dockerfiles look like this:
# docker build -t dependencies .
FROM maven:3-jdk-8
COPY pom.xml pom.xml
RUN mvn dependency:go-offline -B --fail-never

FROM dependencies AS intermediate
RUN git pull repo.git (whatsoever)
RUN mvn package

FROM tomcat
COPY --from=intermediate /target/*.war /usr/local/tomcat/webapps/
The idea is to keep all the dependencies in a different image that Maven can use immediately.
There could be other scenarios I haven't encountered yet, but this solution relieves me a bit from downloading 3 GB of rubbish every time.
I cannot imagine why Java became such a fat whale in today's lean world
I had to deal with the same issue.
Unfortunately, as another contributor already said, dependency:go-offline and the other goals don't fully solve the problem: many dependencies are not downloaded.
I found a working solution as follows.
# Cache dependencies
ADD settings.xml .
ADD pom.xml .
RUN mvn -B -s settings.xml -Ddocker.build.skip=true package test
# Build artifact
ADD src .
RUN mvn -B -s settings.xml -DskipTests package
The trick is to do a full build without sources, which produces a full dependency scan.
In order to avoid errors from some plugins (for example, the OpenAPI Maven generator plugin or the Spring Boot Maven plugin), I had to skip their goals while still letting Maven download all their dependencies, by adding a configuration setting like the following to each one:
<configuration>
  <skip>${docker.build.skip}</skip>
</configuration>
Regards.
I think the general game plan presented among the other answers is the right idea:
Copy pom.xml
Get dependencies
Copy source
Build
However, exactly how you do step #2 is the real key. For me, using the same command I used for building to fetch dependencies was the right solution:
FROM java/java:latest
# Work dir
WORKDIR /app
RUN mkdir -p .
# Copy pom and get dependencies
COPY pom.xml pom.xml
RUN mvn -Dmaven.repo.local=./.m2 install assembly:single
# Copy and build source
COPY . .
RUN mvn -Dmaven.repo.local=./.m2 install assembly:single
Any other command used to fetch dependencies resulted in many things needing to be downloaded during the build step. It makes sense: running the exact command you plan to run later gets you the closest to having everything you need to actually run that command.
I had this issue just a little while ago. There are many solutions on the web, but the one that worked for me is simply to mount a volume for the Maven modules directory:
mkdir /opt/myvolumes/m2
then bind-mount it when running your builds (a Dockerfile VOLUME instruction cannot map a host path to a container path, so the mapping belongs on docker run):
...
docker run -v /opt/myvolumes/m2:/root/.m2 ...
...
There are better solutions, but not as straightforward.
This blog post goes the extra mile in helping you to cache everything:
https://keyholesoftware.com/2015/01/05/caching-for-maven-docker-builds/
A local Nexus 3 image running in Docker and acting as a local proxy is an acceptable solution:
The idea is similar to Dockerizing an apt-cacher-ng service.
You can find a comprehensive step-by-step guide in the linked GitHub repo.
It's really fast.
Another solution would be using a repository manager such as Sonatype Nexus or Artifactory. You can set up a Maven proxy inside the repository manager and then use it as your source of Maven repositories.
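For instance, a sketch of running such a proxy locally (the port, volume name and repository URL assume a default Nexus 3 setup):
# start a local Nexus 3 instance with persistent storage
docker run -d --name nexus -p 8081:8081 -v nexus-data:/nexus-data sonatype/nexus3
# then point a <mirror> in your settings.xml at its public group,
# e.g. http://localhost:8081/repository/maven-public/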
Here is my working solution.
The tricks are:
use a Docker multi-stage build
don't copy the project source into the image created in the first stage, only the pom (or poms, in case your project is multi-module)
Here is my solution for a multi-module project using OpenJDK 11:
## stage 1
FROM adoptopenjdk/maven-openjdk11 as dependencies
ENV HOME=/usr/maven
ENV MVN_REPO=/usr/maven/.m3/repository
RUN mkdir -p $HOME
RUN mkdir -p $MVN_REPO
WORKDIR $HOME
## copy all pom files of the modules tree with the same directory tree of the project
#reactor
ADD pom.xml $HOME
## api module
RUN mkdir -p $HOME/api
ADD api/pom.xml $HOME/api
## application module
RUN mkdir -p $HOME/application
ADD application/pom.xml $HOME/application
## domain module
RUN mkdir -p $HOME/domain
ADD domain/pom.xml $HOME/domain
## service module
RUN mkdir -p $HOME/service
ADD service/pom.xml $HOME/service
## download all dependencies in this docker image. The goal "test" is needed to avoid download of dependencies with <scope>test</scope> in the second stage
RUN mvn -Dmaven.repo.local=$MVN_REPO dependency:go-offline test
## stage 2
FROM adoptopenjdk/maven-openjdk11 as executable
ENV APP_HOME=/usr/app
ENV MVN_REPO=/usr/maven/.m3/repository
ENV APP_MVN_REPO=$MVN_REPO
RUN mkdir -p $APP_HOME
RUN mkdir -p $APP_MVN_REPO
WORKDIR $APP_HOME
ADD . $APP_HOME
## copy the dependency tree from the "stage 1" dependencies image into this image
COPY --from=dependencies $MVN_REPO $APP_MVN_REPO
## package the application, skipping test
RUN mvn -Dmaven.repo.local=$APP_MVN_REPO package -DskipTests
## set ENV values
ENV NAME=VALUE
## copy the jar in the WORKDIR folder
RUN cp $APP_HOME/application/target/*.jar $APP_HOME/my-final-jar-0.0.1-SNAPSHOT.jar
EXPOSE 8080
ENTRYPOINT ["java", "-jar","/usr/app/my-final-jar-0.0.1-SNAPSHOT.jar" ,"--spring.profiles.active=docker"]
This one did the trick very well for me:
edit config.toml
[runner.docker]
...
volumes = ["/cache","m2:/root/.m2"]
...
It will create an "m2" volume that persists across builds, and you know the rest :)
If the dependencies are downloaded after the container is already up, then you need to commit the changes on this container and create a new image with the downloaded artifacts.
Related
I am trying to run a spring-boot maven project inside a docker environment. So the setup is as follows:
Docker is set up and installs Java, etc. (done only once)
App is run (can be any number of times)
What I am experiencing
Every time I run the Spring Boot project with mvn spring-boot:run, it downloads all the required libraries from the pom.xml (Java, Maven, etc. are preinstalled in the Docker image) and then runs the project.
What I am trying to do
This process of reinstalling every time is redundant and time-consuming, so I want to delegate this installation step to Docker as well, ideally using the pom.xml to do the installation, though alternative ways are also welcome.
What I have tried so far
Installing npm by following a good tutorial, but it fails in Docker since we can't restart the terminal during docker build, and source ~/.bash_profile doesn't seem to work either.
I tried building the project directly in Docker (with RUN mvn clean install --fail-never) and copying both the npm and node folders to the directory where I run the app, but that doesn't seem to work either, as it reinstalls them every time even without any change.
Can anyone please help me here? This problem has stalled the project. Thanks a lot!
From your question I understand that in the Dockerfile you just install Java, Maven, etc., but do not build your project with mvn clean install before executing mvn spring-boot:run (which would be redundant anyway, because mvn spring-boot:run builds the project for you before starting the application).
You cannot skip downloading Maven dependencies when running in containers, since containers start from a clean state each time they are spun up. So the dependencies will be downloaded either when you call mvn clean install or when you call mvn spring-boot:run.
The most you can do is build the jar beforehand in your DevOps pipeline, and in the Dockerfile just copy the built jar and execute it.
Example Dockerfile in this case:
FROM openjdk:8-jdk-alpine
ARG JAR_FILE=target/*.jar
COPY ${JAR_FILE} app.jar
ENTRYPOINT ["java","-jar","/app.jar"]
Here the previously built artifact is already available at target/.
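A usage sketch for that approach (the jar name is a placeholder; the --build-arg is optional since the ARG has a default):
# build the jar in the pipeline, then bake it into the image
mvn clean package
docker build --build-arg JAR_FILE=target/my-app.jar -t my-app .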
I am doing a university project where we need to run multiple Spring Boot applications at once.
I have already configured a multi-stage build with the Gradle Docker image to build the app and then run it in an openjdk:jre image.
Here is my Dockerfile:
FROM gradle:5.3.0-jdk11-slim as builder
USER root
WORKDIR /usr/src/java-code
COPY . /usr/src/java-code/
RUN gradle bootJar
FROM openjdk:11-jre-slim
EXPOSE 8080
WORKDIR /usr/src/java-app
COPY --from=builder /usr/src/java-code/build/libs/*.jar ./app.jar
ENTRYPOINT ["java", "-jar", "app.jar"]
I am building and running everything with docker-compose. Part of docker-compose:
website_server:
build: website-server
image: website-server:latest
container_name: "website-server"
ports:
- "81:8080"
Of course the first build takes ages, since Docker pulls all of its dependencies, and I am okay with that.
Everything is working OK for now, but every little change in the code causes around a 1-minute build for a single app.
Part of build log: docker-compose up --build
Step 1/10 : FROM gradle:5.3.0-jdk11-slim as builder
---> 668e92a5b906
Step 2/10 : USER root
---> Using cache
---> dac9a962d8b6
Step 3/10 : WORKDIR /usr/src/java-code
---> Using cache
---> e3f4528347f1
Step 4/10 : COPY . /usr/src/java-code/
---> Using cache
---> 52b136a280a2
Step 5/10 : RUN gradle bootJar
---> Running in 88a5ac812ac8
Welcome to Gradle 5.3!
Here are the highlights of this release:
- Feature variants AKA "optional dependencies"
- Type-safe accessors in Kotlin precompiled script plugins
- Gradle Module Metadata 1.0
For more details see https://docs.gradle.org/5.3/release-notes.html
Starting a Gradle Daemon (subsequent builds will be faster)
> Task :compileJava
> Task :processResources
> Task :classes
> Task :bootJar
BUILD SUCCESSFUL in 48s
3 actionable tasks: 3 executed
Removing intermediate container 88a5ac812ac8
---> 4f9beba838ed
Step 6/10 : FROM openjdk:11-jre-slim
---> 0e452dba629c
Step 7/10 : EXPOSE 8080
---> Using cache
---> d5519e55d690
Step 8/10 : WORKDIR /usr/src/java-app
---> Using cache
---> 196f1321db2c
Step 9/10 : COPY --from=builder /usr/src/java-code/build/libs/*.jar ./app.jar
---> d101eefa2487
Step 10/10 : ENTRYPOINT ["java", "-jar", "app.jar"]
---> Running in ad02f0497c8f
Removing intermediate container ad02f0497c8f
---> 0c63eeef8c8e
Successfully built 0c63eeef8c8e
Successfully tagged website-server:latest
Every time it freezes after Starting a Gradle Daemon (subsequent builds will be faster)
I was thinking about adding a volume with cached Gradle dependencies, but I don't know if that is the core of the problem. Also, I couldn't find good examples of that.
Is there any way to speed up the build?
The build takes a lot of time because Gradle downloads all the plugins and dependencies every time the Docker image is built.
There is no way to mount a volume at image build time, but it is possible to introduce a new stage that downloads all the dependencies and is cached as a Docker image layer.
FROM gradle:5.6.4-jdk11 as cache
RUN mkdir -p /home/gradle/cache_home
ENV GRADLE_USER_HOME /home/gradle/cache_home
COPY build.gradle /home/gradle/java-code/
WORKDIR /home/gradle/java-code
RUN gradle clean build -i --stacktrace
FROM gradle:5.6.4-jdk11 as builder
COPY --from=cache /home/gradle/cache_home /home/gradle/.gradle
COPY . /usr/src/java-code/
WORKDIR /usr/src/java-code
RUN gradle bootJar -i --stacktrace
FROM openjdk:11-jre-slim
EXPOSE 8080
USER root
WORKDIR /usr/src/java-app
COPY --from=builder /usr/src/java-code/build/libs/*.jar ./app.jar
ENTRYPOINT ["java", "-jar", "app.jar"]
The Gradle plugin and dependency cache is located in $GRADLE_USER_HOME/caches. GRADLE_USER_HOME must be set to something other than /home/gradle/.gradle, because /home/gradle/.gradle is defined as a volume in the parent Gradle Docker image and its contents are discarded from every image layer.
In the sample code GRADLE_USER_HOME is set to /home/gradle/cache_home.
In the builder stage Gradle cache is copied to avoid downloading the dependencies again: COPY --from=cache /home/gradle/cache_home /home/gradle/.gradle.
The stage cache will be rebuilt only when build.gradle is changed.
When Java classes are changed, the cached image layer with all the dependencies is reused.
These modifications can reduce the build time, but a cleaner way of building Docker images for Java applications is Jib by Google.
There is a Jib Gradle plugin that allows you to build container images for Java applications without manually creating a Dockerfile.
Building image with application and running the container is similar to:
gradle clean build jib
docker-compose up
Docker caches its images in "layers." Each command that you run is a layer. Each change that is detected in a given layer invalidates the layers that come after it. If the cache is invalidated, then the invalidated layers must be built from scratch, including dependencies.
I would suggest splitting your build steps. Have a previous layer which only copies the dependency specification into the image, then runs a command which will result in Gradle downloading the dependencies. After that's complete, copy your source into the same location where you just did that, and run the real build.
This way, the previous layers will be invalidated only when the gradle files change.
I haven't done this with Java/Gradle, but I have followed the same pattern with a Rust project, guided by this blog post.
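A compact sketch of that split for the Gradle build in the question (image tag and paths reused from the question's Dockerfile; the dependency-resolution step is an assumption, since Gradle has no dedicated task for it and a tolerated gradle dependencies run stands in):
FROM gradle:5.3.0-jdk11-slim as builder
USER root
WORKDIR /usr/src/java-code
# dependency layer: copy only the build script first
COPY build.gradle /usr/src/java-code/
# resolve what can be resolved without sources; tolerate failures
RUN gradle --no-daemon dependencies || true
# source layer: copy everything and do the real build
COPY . /usr/src/java-code/
RUN gradle --no-daemon bootJar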
You can try and use BuildKit (now activated by default in the latest docker-compose 1.25)
See "Speed up your java application Docker images build with BuildKit!" from
Aboullaite Med.
(This was for maven, but the same idea applies to gradle)
let's consider the following Dockerfile:
FROM maven:3.6.1-jdk-11-slim AS build
USER MYUSER
RUN mvn clean package
Modifying the second line always invalidates the Maven cache due to a false dependency, which exposes the inefficient caching issue.
BuildKit solves this limitation by introducing the concurrent build graph solver, which can run build steps in parallel and optimize out commands that don’t have an impact on the final result.
Additionally, BuildKit tracks only the updates made to files between repeated build invocations, which optimizes access to the local source files. Thus, there is no need to wait for local files to be read or uploaded before the work can begin.
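A hypothetical Gradle equivalent of the Maven cache mount shown earlier in this thread (the cache path and image tag are assumptions):
# syntax = docker/dockerfile:1.0-experimental
FROM gradle:5.6.4-jdk11 AS build
# keep Gradle's caches in a BuildKit cache mount instead of the image layers
ENV GRADLE_USER_HOME=/cache/gradle-home
WORKDIR /home/gradle/src
COPY . .
RUN --mount=type=cache,target=/cache/gradle-home gradle --no-daemon bootJar
As before, it needs to be built with DOCKER_BUILDKIT=1 docker build .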
As the other answers have mentioned, Docker caches each step in a layer. If you could somehow get only the downloaded dependencies into a layer, then they would not have to be re-downloaded each time, assuming the dependencies haven't changed.
Unfortunately, gradle doesn't have a built-in task to do this. But you can still work around it. Here's what I did:
# Only copy dependency-related files
COPY build.gradle gradle.properties settings.gradle /app/
# Only download dependencies
# Eat the expected build failure since no source code has been copied yet
RUN gradle clean build --no-daemon > /dev/null 2>&1 || true
# Copy all files
COPY ./ /app/
# Do the actual build
RUN gradle clean build --no-daemon
Also, make sure your .dockerignore file has at least these items, so that they're not sent in the docker build context when the image is built:
.gradle/
bin/
build/
gradle/
Just as an addition to the other answers: if your internet connection is slow, since the build downloads dependencies every single time, you might want to set up Sonatype Nexus in order to keep the dependencies it has already downloaded.
I used a slightly different idea. I scheduled a nightly build on my Jenkins building the entire Gradle project:
docker build -f Dockerfile.cache --tag=gradle-cache:latest .
# GRADLE BUILD CACHE
FROM gradle:6.7.1-jdk11
COPY build.gradle.kts /home/gradle/code/
COPY settings.gradle.kts /home/gradle/code/
COPY gradle.properties /home/gradle/code/
COPY ./src /home/gradle/code/src
WORKDIR /home/gradle/code
RUN gradle bootJar -i -s
Then I start my builds from this "cache image" so I can leverage all the Gradle goodness:
docker build --tag=my-app:$version .
# GRADLE BUILD
FROM gradle-cache:latest as gradle
COPY build.gradle.kts /home/gradle/code/
COPY settings.gradle.kts /home/gradle/code/
COPY gradle.properties /home/gradle/code/
RUN rm -rf /home/gradle/code/src
COPY ./src /home/gradle/code/src
WORKDIR /home/gradle/code
RUN gradle bootJar -i -s
# SPRING BOOT
FROM openjdk:11.0.9.1-jre
COPY --from=gradle /home/gradle/code/build/libs/app.jar app.jar
EXPOSE 8080
ENTRYPOINT ["java", "-Xmx2G", "-Djava.security.egd=file:/dev/./urandom", "-jar", "app.jar"]
Remember to prune unused images every week or so.
I don't know much about Docker internals, but I think the problem is that each new docker build command will copy all the files and build them (if it detects a change in at least one file).
This will most likely change several jars, so the second step needs to run too.
My suggestion is to build on the terminal (outside of Docker) and only docker build the app image.
This can even be automated with a gradle plugin:
https://github.com/Transmode/gradle-docker (one example, I did not search thoroughly)
I'm working on a project with ~200 MB of dependencies and I'd like to avoid useless uploads due to my limited bandwidth.
When I push the image built from my Dockerfile (I'll attach it in a moment), I always have a ~200 MB upload even if I didn't touch the pom.xml:
FROM maven:3.6.0-jdk-8-slim
WORKDIR /app
ADD pom.xml /app
RUN mvn verify clean --fail-never
COPY ./src /app/src
RUN mvn package
ENV CONFIG_FOLDER=/app/config
ENV DATA_FOLDER=/app/data
ENV GOLDENS_FOLDER=/app/goldens
ENV DEBUG_FOLDER=/app/debug
WORKDIR target
CMD ["java","-jar","-Dlogs=/app/logs", "myProject.jar"]
This Dockerfile makes a ~200 MB fat JAR including all the dependencies, which is why the ~200 MB upload occurs every time. What I would like to achieve is building a layer with all the dependencies and "telling" the packaging phase not to include the dependency JARs in the fat JAR, but to look for them in a given directory instead.
I was wondering about writing a script that executes mvn dependency:copy-dependencies before the build process and then copies the directory into the container, then building a "non-fat" JAR that only references those dependencies instead of actually copying them into it.
Is this possible?
EDIT:
I discovered that the Maven local repository of the container is located under /root/.m2. So I ended up making a very simple script like this:
BuildDocker.sh
mvn verify clean --fail-never
mv ~/.m2 ~/git/myProjectRepo/.m2
sudo docker build -t myname/myproject:"$1" .
And edited the Dockerfile like this:
# Use an official Maven image as a parent image
FROM maven:3.6.0-jdk-8-slim
# Copy my Maven local repository into the container, thus creating a new layer
COPY ./.m2 /root/.m2
# Set the working directory to /app
WORKDIR /app
# Copy the pom.xml
ADD pom.xml /app
# Resolve and Download all dependencies: this will be done only if the pom.xml has any changes
RUN mvn verify clean --fail-never
# Copy source code and configs
COPY ./src /app/src
# create a ThinJAR
RUN mvn package
# Run the jar
...
After the building process I verified that /root/.m2 has all the directories I expect, but as soon as I launch the JAR I get:
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/log4j/Priority
at myProject.ThreeMeans.calculate(ThreeMeans.java:17)
at myProject.ClusteringStartup.main(ClusteringStartup.java:7)
Caused by: java.lang.ClassNotFoundException: org.apache.log4j.Priority
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 2 more
Maybe I shouldn't run it through java -jar?
If I understand correctly what you'd like to achieve, the problem is to avoid creating a fat jar with all Maven dependencies at each Docker build (to alleviate the size of the Docker layers to be pushed after a rebuild).
If yes, you may be interested in the Spring Boot Thin Launcher, which is also applicable for non-Spring-Boot projects. Some comprehensive documentation is available in the README.md of the corresponding GitHub repo:
https://github.com/dsyer/spring-boot-thin-launcher#readme
To sum up, it should suffice to add the following plugin declaration in your pom.xml:
<build>
  <plugins>
    <plugin>
      <groupId>org.springframework.boot</groupId>
      <artifactId>spring-boot-maven-plugin</artifactId>
      <!--<version>${spring-boot.version}</version>-->
      <dependencies>
        <dependency>
          <groupId>org.springframework.boot.experimental</groupId>
          <artifactId>spring-boot-thin-layout</artifactId>
          <version>1.0.19.RELEASE</version>
        </dependency>
      </dependencies>
    </plugin>
  </plugins>
</build>
Ideally, this solution should be combined with a standard Dockerfile setup to benefit from Docker's cache (see below for a typical example).
Leverage Docker's cache mechanism for a Java/Maven project
The archetype of a Dockerfile that avoids re-downloading all Maven dependencies at each build if only source code files (src/*) have been touched is given in the following reference:
https://whitfin.io/speeding-up-maven-docker-builds/
To be more precise, the proposed Dockerfile is as follows:
# our base build image
FROM maven:3.5-jdk-8 as maven
WORKDIR /app
# copy the Project Object Model file
COPY ./pom.xml ./pom.xml
# fetch all dependencies
RUN mvn dependency:go-offline -B
# copy your other files
COPY ./src ./src
# build for release
# NOTE: my-project-* should be replaced with the proper prefix
RUN mvn package && cp target/my-project-*.jar app.jar
# smaller, final base image
FROM openjdk:8u171-jre-alpine
# OPTIONAL: copy dependencies so the thin jar won't need to re-download them
# COPY --from=maven /root/.m2 /root/.m2
# set deployment directory
WORKDIR /app
# copy over the built artifact from the maven image
COPY --from=maven /app/app.jar ./app.jar
# set the startup command to run your binary
CMD ["java", "-jar", "/app/app.jar"]
Note that it relies on the so-called multi-stage build feature of Docker (presence of two FROM directives), implying the final image will be much smaller than the maven base image itself.
(If you are not interested in that feature during the development phase, you can remove the lines FROM openjdk:8u171-jre-alpine and COPY --from=maven /app/app.jar ./app.jar.)
In this approach, the Maven dependencies are fetched with RUN mvn dependency:go-offline -B before the line COPY ./src ./src (to benefit from Docker's cache).
Note however that the dependency:go-offline standard goal is not "perfect" as a few dynamic dependencies/plugins may still trigger some re-downloading at the mvn package step.
If this is an issue for you (e.g. if at some point you'd really want to work offline), you could take a look at that other SO answer that suggests using a dedicated plugin that provides the de.qaware.maven:go-offline-maven-plugin:resolve-dependencies goal.
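If you go that route, the dependency-fetching line in the Dockerfile above would become something like this (a sketch; the version is left for Maven to resolve from the plugin coordinates given above):
# fetch dependencies with the go-offline-maven-plugin instead of dependency:go-offline
RUN mvn -B de.qaware.maven:go-offline-maven-plugin:resolve-dependencies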
In general, a Dockerfile container build works in layers, and each time you build, these layers are kept in the cache and reused if there are no changes.
Ideally it should work the same way here.
Maven by default looks for dependencies in the .m2 folder located in the user's home directory (on Ubuntu, /home/username/).
If the dependency jars are not available there, it downloads them into .m2 and uses them.
Now you can copy this .m2 folder after one successful build and place it in the container user's home directory.
Do this before you run the build command.
Note: you might need to replace the existing .m2 folder in the image.
So your Dockerfile would be something like this:
FROM maven:3.6.0-jdk-8-slim
WORKDIR /app
COPY .m2/ /root/.m2/
ADD pom.xml /app
RUN mvn verify clean --fail-never
COPY ./src /app/src
RUN mvn package
...
The documentation of the official Maven Docker images also points out different ways to achieve better caching of dependencies.
Basically, they recommend either mounting the local Maven repository as a volume and using it across Docker images, or using a special local repository (/usr/share/maven/ref/) whose contents are copied into the container's repository on container startup.
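For example (a sketch reusing the paths from the original question's docker run command and the settings file shipped in the official image):
# option 1: share the host's local repository with the build container
docker run -it --rm -v "$HOME/.m2":/root/.m2 -v "$PWD":/usr/src/mymaven -w /usr/src/mymaven maven:alpine mvn compile
# option 2: pre-seed the special reference repository during an image build
# RUN mvn -B -s /usr/share/maven/ref/settings-docker.xml dependency:go-offline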
When I created a Spring Boot application I could see mvnw and mvnw.cmd files in the root of the project. What is the purpose of these two files?
These files are from Maven wrapper. It works similarly to the Gradle wrapper.
This allows you to run the Maven project without having Maven installed and present on the path. It downloads the correct Maven version if it's not found (as far as I know by default in your user home directory).
The mvnw file is for Linux (bash) and the mvnw.cmd is for the Windows environment.
To create or update all necessary Maven Wrapper files execute the following command:
mvn -N io.takari:maven:wrapper
To use a different version of maven you can specify the version as follows:
mvn -N io.takari:maven:wrapper -Dmaven=3.3.3
Both commands require Maven on the PATH (add the path to the Maven bin directory to the Path system variable). If you already have mvnw in your project, you can use ./mvnw instead of mvn in the commands.
Command mvnw uses Maven that is by default downloaded to ~/.m2/wrapper on the first use.
URL with Maven is specified in each project at .mvn/wrapper/maven-wrapper.properties:
distributionUrl=https://repo1.maven.org/maven2/org/apache/maven/apache-maven/3.3.9/apache-maven-3.3.9-bin.zip
To update or change Maven version invoke the following (remember about --non-recursive for multi-module projects):
./mvnw io.takari:maven:wrapper -Dmaven=3.3.9
or just modify .mvn/wrapper/maven-wrapper.properties manually.
To generate the wrapper from scratch using Maven (you need to have it on the PATH already), run:
mvn io.takari:maven:wrapper -Dmaven=3.3.9
The Maven Wrapper is an excellent choice for projects that need a specific version of Maven (or for users that don't want to install Maven at all). Instead of installing many versions of it in the operating system, we can just use the project-specific wrapper script.
mvnw: it's an executable Unix shell script used in place of a fully installed Maven
mvnw.cmd: it's for Windows environment
Use Cases
The wrapper should work with different operating systems such as:
Linux
OSX
Windows
Solaris
After that, we can run our goals like this for the Unix system:
./mvnw clean install
And the following command for Batch:
./mvnw.cmd clean install
If we don't have the specified Maven in the wrapper properties, it'll be downloaded and installed in the folder $USER_HOME/.m2/wrapper/dists of the system.
Maven Wrapper plugin
We can use the Maven Wrapper plugin to set up automatic installation in a simple Spring Boot project.
First, we need to go in the main folder of the project and run this command:
mvn -N io.takari:maven:wrapper
We can also specify the version of Maven:
mvn -N io.takari:maven:wrapper -Dmaven=3.5.2
The option -N means --non-recursive, so that the wrapper will only be applied to the main project in the current directory, not to any submodules.
Source 1 (further reading): https://www.baeldung.com/maven-wrapper
short answer: to run Maven and Gradle in the terminal without following manual installation processes.
Gradle example:
./gradlew clean build
./gradlew bootRun
Maven example:
./mvnw clean install
./mvnw spring-boot:run
"The recommended way to execute any Gradle build is with the help of the Gradle Wrapper (in short just “Wrapper”). The Wrapper is a script that invokes a declared version of Gradle, downloading it beforehand if necessary. As a result, developers can get up and running with a Gradle project quickly without having to follow manual installation processes saving your company time and money."
Gradle adds its own corresponding files, gradlew and gradlew.bat, equivalent to the Maven wrapper files.
On Windows, mvnw clean install is used for the Maven clean and install goals, and mvnw spring-boot:run is used to run the Spring Boot application from the Command Prompt.
For example:
C:\SamplesSpringBoot\demo>mvnw clean install
C:\SamplesSpringBoot\demo>mvnw spring-boot:run
By far the best option nowadays would be using a maven container as a builder tool. A mvn.sh script like this would be enough:
#!/bin/bash
docker run --rm -ti \
-v $(pwd):/opt/app \
-w /opt/app \
-e TERM=xterm \
-v $HOME/.m2:/root/.m2 \
maven mvn "$@"
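Usage would then be something like this (the host's ~/.m2 keeps the downloaded dependencies between runs):
chmod +x mvn.sh
./mvn.sh clean package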
I don't get it. I've set up my pom.xml to use the Maven exec plugin so I can execute some of the classes in my project with the correct classpath, -D defines and -javaagent. So from a shell with the classes built in ./target/classes etc.. I can run the main() methods using
mvn exec:java -Dexec.mainClass=classWithAMainMethod
All good so far.
Now I want to ship my project (a jar artifact) and I still want to be able to use the configuration I've put in the pom.xml for running the classes with the correct arguments, etc. How do I do it? Is there some way of saying
mvn -artifactJar=MyArtifact.jar exec:java -Dexec.mainClass=classWithAMainMethod
when all I have is MyArtifact.jar (or a Maven repository with MyArtifact.jar in it)?
I've tried the following:
Get the jar with the dependency:get goal and unzip it. I can't do anything with it
as the pom.xml seems to end up in META-INF/maven in the artifact jar. Is there any way of using it?
Creating a dummy pom, with a single dependency on my project's artifact, where I want to run my project. I can then use exec:java to run the main classes, but it doesn't use the configuration from my project's pom.
Thanks.
The AppAssembler plugin worked out quite well for me. I replaced the exec plugin config in my project's pom with something like this in the build section:
<plugin>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>appassembler-maven-plugin</artifactId>
  <version>1.2.2</version>
  <configuration>
    <repositoryLayout>flat</repositoryLayout>
    <repositoryName>lib</repositoryName>
    <extraJvmArguments>
      -Djava.rmi.server.hostname=localhost
      -javaagent:${spring.javaagent.jar}
    </extraJvmArguments>
    <programs>
      <program>
        <name>foo1</name>
        <mainClass>net.foor.FooMain</mainClass>
      </program>
      ...
    </programs>
  </configuration>
</plugin>
In Eclipse I created an external tools launcher to run the resulting scripts from target/appassembler/bin
On the machine I wanted to deploy to (assuming access to the internal Maven repository where my artifact and its dependencies have been installed/deployed), I did the following (sketched as a script after the list):
First use wget or mvn dependency:get to get a copy of my artifact jar.
Extract the pom. unzip -j artifact.jar */pom.xml*
Run mvn appassembler:assemble -DassembleDirectory=.
Move the artifact.jar into the ./lib directory
Set execute permissions on generated shell scripts in ./bin
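Roughly, the whole sequence as a shell sketch (my-artifact.jar is a placeholder for the real artifact file):
# assumes the artifact jar was already fetched with wget or mvn dependency:get
unzip -j my-artifact.jar "*/pom.xml*"
mvn appassembler:assemble -DassembleDirectory=.
mv my-artifact.jar lib/
chmod +x bin/*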
Have you tried using something like onejar?
That sounds like what you're looking for.