Update dockerfile image without downloading dependencies

I am new to Docker. I use a Debian image to host a J2EE application.
FROM debian
WORKDIR /app
ADD . /app
RUN apt-get update && apt-get --assume-yes install \
default-jre \
default-jdk \
maven
RUN mvn clean install
CMD ["mvn", "ninja:run"]
I build my image by doing this:
docker build . -t rssaggregator
Let's suppose I add a new dependency to download. How can I update and build the image without downloading the dependencies again?
Thanks for your help!

If you add the dependencies in separate RUN statements after your primary installs (and don't change anything in the Dockerfile above them), Docker will use the cached layers and won't rebuild the unmodified ones unless you pass --no-cache to the build (which you may want to do at some point to update your primary installs/layers).
You may also want to pin the versions of the JRE and JDK you install, so you know which ones are in use and can update them deliberately by changing those versions.
See the Dockerfile best practices guide. You may also want to try a multi-stage build as a more advanced way of building on a base image.
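A minimal sketch of the layer-ordering idea, based on the Dockerfile from the question. It assumes the project has a pom.xml in its root and that the dependency:go-offline goal of the maven-dependency-plugin resolves what the build needs (some plugins may still fetch extra artifacts later), so treat it as an illustration rather than a tested setup:

```dockerfile
FROM debian
# Tooling first: this layer is reused as long as the lines above and in it are unchanged
RUN apt-get update && apt-get --assume-yes install \
    default-jre \
    default-jdk \
    maven
WORKDIR /app
# Copy only the POM, so the dependency layer below stays cached until pom.xml changes
COPY pom.xml /app/
# Assumption: go-offline downloads the dependencies declared in pom.xml
RUN mvn dependency:go-offline
# Source changes invalidate the cache only from this point on
COPY . /app
RUN mvn clean install
CMD ["mvn", "ninja:run"]
```

Note that editing pom.xml invalidates the dependency layer, so that step runs again in full; builds that change only source files reuse it.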

Related

To include maven in dockerfile or not?

I have this working simple dockerfile.
FROM openjdk:8-jdk-alpine
WORKDIR /data
COPY target/*.jar myapp.jar
ENTRYPOINT ["java","-jar","myapp.jar"]
I build my jar using Maven, either locally or in a pipeline, then use that .jar here. I've seen many examples that install Maven in the Dockerfile instead of doing the build beforehand. Doesn't that just make the image larger? Is there a benefit to doing that?

Usually I have a CI/CD server that builds my jar file, and then I generate a Docker image from it. Building a jar consumes resources, and doing it when you're running your Docker container can take longer depending on your configuration. In a normal CI/CD strategy, build and deploy are different steps. I also believe your Docker image should be as lean as possible.
That's my opinion.
I hope I could help you somehow.

I think you are looking for Multi-stage builds.
Example of multistage Dockerfile:
# syntax=docker/dockerfile:1
FROM golang:1.16
WORKDIR /go/src/github.com/alexellis/href-counter/
RUN go get -d -v golang.org/x/net/html
COPY app.go ./
RUN CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -o app .
FROM alpine:latest
RUN apk --no-cache add ca-certificates
WORKDIR /root/
COPY --from=0 /go/src/github.com/alexellis/href-counter/app ./
CMD ["./app"]
Notice the COPY --from=0 ... line: it copies the result of the build from the first stage into the second.
Multi-stage builds are a good idea for builds that need to install their own tools in specific versions.
Example taken from https://docs.docker.com/develop/develop-images/multistage-build/
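For a Maven project like the one in the question, a multi-stage build could look like the sketch below. The image tags, the /build directory and the assumption that the build produces exactly one jar are illustrative and not taken from the question:

```dockerfile
# Build stage: Maven and the JDK live only here
FROM maven:3.6.1-jdk-8 AS build
WORKDIR /build
COPY pom.xml .
COPY src ./src
RUN mvn clean package

# Runtime stage: only the JRE and the jar end up in the final image
FROM openjdk:8-jre-alpine
WORKDIR /data
# Assumption: exactly one jar is produced under target/
COPY --from=build /build/target/*.jar myapp.jar
ENTRYPOINT ["java", "-jar", "myapp.jar"]
```

The final image has the same contents as the one built from a pre-built jar, but the build itself no longer depends on the tools installed on the host or the pipeline agent.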

Run gauge tests inside a docker container

I'm trying to Dockerize a Gauge test automation project so I can run specs inside a Docker container. The project is written in Java and Spring Boot.
I saw this tutorial in Gauge documentation.
This is the Dockerfile from the tutorial:
FROM ubuntu
# Install Java.
RUN apt-get update && apt-get install -q -y \
openjdk-8-jdk \
apt-transport-https \
gnupg2 \
ca-certificates
# Install gauge
RUN apt-key adv --keyserver hkp://ipv4.pool.sks-keyservers.net --recv-keys 023EDB0B && \
echo deb https://dl.bintray.com/gauge/gauge-deb stable main | tee -a /etc/apt/sources.list
RUN apt-get update && apt-get install gauge
# Install gauge plugins
RUN gauge install java && \
gauge install screenshot
ENV PATH=$HOME/.gauge:$PATH
As you can see, there's no ADD or COPY in the Dockerfile.
Is it just suggesting an alternative to install Gauge and the other packages on the host?
Any ideas on how to run the specs inside a Docker container?

Here is what I did to get the test running in the docker container.
I have a specs folder beside src in my project structure, meaning the Gauge tests run against the JAR file but are not part of the JAR file themselves.
--MyProject
----specs
----src
...
I used Maven to run the tests inside the container. That's why I preferred to build the project inside the container: the JAR file is built with the same version of Maven that runs the tests.
Here is the Dockerfile. I wrote a bash script to run the tests. You can run the script with CMD or ENTRYPOINT:
FROM maven:3.6.1-jdk-8
# add any project resources needed
ADD env /home/e2e/env
ADD specs /home/e2e/specs
ADD src /home/e2e/src
ADD src/main/scripts/entrypoint.sh /home/e2e/
ADD pom.xml /home/e2e/
RUN ["chmod", "+x", "/home/e2e/entrypoint.sh"]
# Install Gauge, web browser and webdriver in your preferred way...
ENV PATH=$HOME/.gauge:$PATH
# I'm keeping the container running, but it's up to you.
CMD /home/e2e/entrypoint.sh && tail -f /dev/null
And then here is the simple entrypoint.sh script:
#!/bin/bash
cd /home/e2e/
mvn clean package
gauge --version
google-chrome --version
mvn -version
mvn gauge:execute -DspecsDir=specs/myTest.spec
Of course, you could just use a ready JAR instead of building it inside the container. Or you could build the JAR while creating the docker image.

How to reduce my java/gradle docker image size?

I have a Docker file like the following:
FROM openjdk:8
ADD . /usr/share/app-name-tmp
WORKDIR /usr/share/app-name-tmp
RUN ./gradlew build && \
mv ./build/libs/app-name*.jar /usr/share/app-name/app-name.jar
WORKDIR /usr/share/app-name
RUN rm -rf /usr/share/app-name-tmp
EXPOSE 8080
RUN chmod +x ./docker-entry.sh
ENTRYPOINT [ "./docker-entry.sh" ]
The problem is that the final image size is 1.1GB. I know this happens because Gradle downloads and stores all the dependencies. What is the best way to remove those unnecessary files and just keep the jar?

I am really confused about your image size. I have typical Spring Boot applications offering a REST service including an embedded servlet container in less than 200MB! It looks like your project dependencies can and should be optimised.
Docker Image
The openjdk:8 base image (243MB compressed) can be replaced by one built on the smaller Alpine Linux, such as openjdk:8-jdk-alpine (52MB). If you don't need compiler capabilities (e.g. you don't use JSPs), you can also go for openjdk:8-jre-alpine (42MB), which includes only the runtime; have a look on Docker Hub. I use it for Spring Boot based REST services and it works well.
Java Dependencies
The Java dependencies needed for compile and runtime have to be included, but you may have unused dependencies among them:
Check your dependencies: are the current compile/runtime dependencies really used, or can they be removed or moved to test? See the Gradle Java Plugin documentation.
Some dependencies pull in a lot of transitive dependencies (display them using gradle dependencies). Look for unnecessary ones and exclude them if unused; see Gradle Dependency Management. Be sure to run integration tests before finalizing the change: some transitive dependencies are not well documented but may be essential!

With Docker 17.05+ you can use multi-stage builds.
"With multi-stage builds, you use multiple FROM statements in your Dockerfile. Each FROM instruction can use a different base, and each of them begins a new stage of the build. You can selectively copy artifacts from one stage to another, leaving behind everything you don’t want in the final image."
So your Dockerfile could look like this:
#
# first stage (build)
#
FROM openjdk:8 as build
ADD . /usr/share/app-name-tmp
WORKDIR /usr/share/app-name-tmp
RUN ./gradlew build && \
mv ./build/libs/app-name*.jar /usr/share/app-name/app-name.jar
#
# second stage. use alpine to reduce the image size
#
FROM openjdk:8-jre-alpine
WORKDIR /usr/share/app-name
COPY --from=build /usr/share/app-name/app-name.jar .
EXPOSE 8080
RUN chmod +x ./docker-entry.sh
ENTRYPOINT [ "./docker-entry.sh" ]
This way you only keep the jar and all the unnecessary files are not included in the final image.

Each RUN instruction creates a new layer on top of the existing file system. So the new layer created by the RUN instruction that deletes your app-name-tmp directory just masks the previous layer containing the downloaded libraries. Hence your Docker image still carries the size of all the layers built.
Remove the separate RUN rm -rf /usr/share/app-name-tmp instruction and include it in the same RUN instruction that does the Gradle build, as shown below.
RUN ./gradlew build && \
mv ./build/libs/app-name*.jar /usr/share/app-name/app-name.jar && \
rm -rf /usr/share/app-name-tmp/*
So, your final Dockerfile would be:
FROM openjdk:8
ADD . /usr/share/app-name-tmp
WORKDIR /usr/share/app-name-tmp
RUN ./gradlew build && \
mv ./build/libs/app-name*.jar /usr/share/app-name/app-name.jar && \
rm -rf /usr/share/app-name-tmp/*
WORKDIR /usr/share/app-name
EXPOSE 8080
RUN chmod +x ./docker-entry.sh
ENTRYPOINT [ "./docker-entry.sh" ]
The image will still include the size of the files that ADD copied into /usr/share/app-name-tmp, because the ADD instruction creates its own layer.

It seems your image comes from
FROM openjdk:8
which is built from
https://github.com/docker-library/openjdk/blob/e6e9cf8b21516ba764189916d35be57486203c95/8-jdk/Dockerfile
and is in fact based on Debian:
FROM buildpack-deps:jessie-scm
You should try an Alpine base instead:
https://github.com/docker-library/openjdk/blob/9a0822673dffd3e5ba66f18a8547aa60faed6d08/8-jdk/alpine/Dockerfile
I guess your image will shrink by at least half.

Is this the container you deploy to production? If so, don't use it for the actual build. Do the build (and the testing) elsewhere and once it is blessed, copy just the JAR to your Docker production container.

For OpenJDK-12
My application is written in Kotlin with Spring Boot and Maven.
I had the same issue with OpenJDK 12 and Oracle OpenJDK 12: the size is 470 MB.
I wanted to reduce my container size, so I selected adoptopenjdk/openjdk12:x86_64-alpine-jre-12.33 and got it down to 189 MB, as shown below.
FROM adoptopenjdk/openjdk12:x86_64-alpine-jre-12.33
RUN mkdir /app
COPY ./target/application-SNAPSHOT.jar /app/application-SNAPSHOT.jar
WORKDIR /app
CMD ["java", "-jar", "application-SNAPSHOT.jar"]
My final container size is 189MB (34 MB Application Jar size + 155 MB Base image size.)

Build docker image with jetty - when should I build?

I'm working on 'dockerizing' a java web application (https://github.com/kermitt2/grobid) which I want to run using jetty.
Here is the Dockerfile:
FROM jetty:9.3-jre8
ADD ./grobid-home/target/grobid-home-0.4.1-SNAPSHOT.zip /opt
RUN unzip /opt/grobid-home-0.4.1-SNAPSHOT.zip -d /opt && \
rm /opt/grobid-home-0.4.1-SNAPSHOT.zip && \
apt-get update && apt-get -y --no-install-recommends install libxml2
COPY ./grobid-service/target/grobid-service-0.4.1-SNAPSHOT.war \
/var/lib/jetty/webapps/ROOT.war
The current Docker image works perfectly, but it requires the application to be built beforehand (it cannot be built from a fresh git clone).
For example, I could not run a build with the Docker Hub build system.
What would be the preferable approach? Build the Maven project while building the image, or run Docker after the build has finished successfully?

I assume the docker image you are creating is for production.
If you create an image that takes the sources and builds the war, you will have to embed:
The JDK
Maven
Your sources
Each of these is useless at runtime and takes up a lot of space in your image for nothing.
So yes, IMO you should only add the war to your Docker image and not build from within.
I also think you should not build your Docker image inside your Maven process: they are two separate processes that you can automate with some higher-level scripting (or a Jenkins pipeline).

Docker cache gradle dependencies

I'm trying to deploy our Java web application to AWS Elastic Beanstalk using Docker. The idea is to be able to run the container locally for development and testing and eventually push it up to production using git.
I've created a base image that has tomcat8 and java8 installed; the image that performs the Gradle builds inherits from this base image, which speeds up the build process.
All works well, except that the inheriting application container built with Docker doesn't seem to cache the Gradle dependencies: it downloads them every time, including gradlew. We build our web application using the following command:
./gradlew war
Is there some way that I can cache the files in ~/.gradle? This would speed my build up dramatically.
This isn't so much of an issue on Beanstalk, but it is a big problem for devs trying to build and run locally, as it takes a lot of time, as you can imagine.
The base image dockerfile:
FROM phusion/baseimage
EXPOSE 8080
RUN apt-get update
RUN add-apt-repository ppa:webupd8team/java
RUN apt-get update
RUN echo oracle-java8-installer shared/accepted-oracle-license-v1-1 select true | sudo /usr/bin/debconf-set-selections
RUN apt-get -y install oracle-java8-installer
RUN java -version
ENV TOMCAT_VERSION 8.0.9
RUN wget --quiet --no-cookies http://archive.apache.org/dist/tomcat/tomcat-8/v${TOMCAT_VERSION}/bin/apache-tomcat-${TOMCAT_VERSION}.tar.gz -O /tmp/catalina.tar.gz
# Unpack
RUN tar xzf /tmp/catalina.tar.gz -C /opt
RUN mv /opt/apache-tomcat-${TOMCAT_VERSION} /opt/tomcat
RUN ln -s /opt/tomcat/logs /var/log/tomcat
RUN rm /tmp/catalina.tar.gz
# Remove unneeded apps
RUN rm -rf /opt/tomcat/webapps/examples
RUN rm -rf /opt/tomcat/webapps/docs
RUN rm -rf /opt/tomcat/webapps/ROOT
ENV CATALINA_HOME /opt/tomcat
ENV PATH $PATH:$CATALINA_HOME/bin
ENV CATALINA_OPTS $PARAM1
# Start Tomcat
CMD ["/opt/tomcat/bin/catalina.sh", "run"]
The application dockerfile:
FROM <tag name here for base image>
RUN mkdir ~/.gradle
# run some extra stuff here to add things to gradle.properties file
# Add project Source
ADD . /var/app/myapp
# Compile and Deploy Application, this is what is downloading gradlew and all the maven dependencies every time, if only there was a way to take the changes it makes to ~/.gradle and persist it as a cache layer
RUN cd /var/app/myapp/ && ./gradlew war
RUN mv /var/app/myapp/build/libs/myapp.war /opt/tomcat/webapps/ROOT.war
# Start Tomcat
CMD ["/opt/tomcat/bin/catalina.sh", "run"]

I faced this issue. As you might agree, it is a best practice to download dependencies as a separate step while building the Docker image. It becomes a little tricky with Gradle, since there is no direct support for downloading just the dependencies.
Option 1 : Using docker-gradle Docker image
We can use a pre-built Gradle Docker image to build the application. This ensures that the build is not done on the local system but on a clean Docker image.
docker volume create --name gradle-cache
docker run --rm -v gradle-cache:/home/gradle/.gradle -v "$PWD":/home/gradle/project -w /home/gradle/project gradle:4.7.0-jdk8-alpine gradle build
ls -ltrh ./build/libs
The Gradle cache is mounted here as a volume, so subsequent builds reuse the downloaded dependencies.
After this, we can have a Dockerfile that takes this artifact and generates an application-specific image to run the application.
This way, the builder image is not required. The application build flow and the application run flow are separated.
Since the gradle-cache volume is mounted, we can reuse the downloaded dependencies across different Gradle projects.
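A minimal sketch of such a runtime Dockerfile. It assumes the Gradle build above left a runnable jar at build/libs/your-application.jar (a placeholder name), that a JRE-only base image is enough, and that the application listens on port 8080:

```dockerfile
FROM openjdk:8-jre-alpine
WORKDIR /usr/app
# Assumption: the jar was produced on the host by the docker run ... gradle build step
COPY build/libs/your-application.jar app.jar
EXPOSE 8080
CMD ["java", "-jar", "app.jar"]
```

Because the jar is copied from the build context, a .dockerignore file (if present) must not exclude the build directory.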
Option 2 : Multi-stage build
----- Dockerfile -----
FROM openjdk:8 AS TEMP_BUILD_IMAGE
ENV APP_HOME=/usr/app/
WORKDIR $APP_HOME
COPY build.gradle settings.gradle gradlew $APP_HOME
COPY gradle $APP_HOME/gradle
RUN ./gradlew build || return 0
COPY . .
RUN ./gradlew build
FROM openjdk:8
ENV ARTIFACT_NAME=your-application.jar
ENV APP_HOME=/usr/app/
WORKDIR $APP_HOME
COPY --from=TEMP_BUILD_IMAGE $APP_HOME/build/libs/$ARTIFACT_NAME .
EXPOSE 8080
# Shell form, so that $ARTIFACT_NAME is expanded (the exec/JSON form does not expand variables)
CMD exec java -jar $ARTIFACT_NAME
In the above Dockerfile:
First we copy only the project's Gradle files, such as build.gradle, gradlew, etc.
Then we copy the gradle directory itself.
Then we try to run the build. At this point no other source files exist in the directory, so the build will fail, but before failing it downloads the dependencies.
Since we expect the build to fail, I used a simple technique: return 0 so that Docker continues execution.
This speeds up subsequent builds, since all the dependencies are downloaded and Docker has cached this layer. By comparison, volume-mounting the Gradle cache directory is still the best approach.
The above example also showcases multi-stage Docker image building, which avoids needing multiple Dockerfiles.

I
Add resolveDependencies task in build.gradle:
task resolveDependencies {
    doLast {
        project.rootProject.allprojects.each { subProject ->
            subProject.buildscript.configurations.each { configuration ->
                configuration.resolve()
            }
            subProject.configurations.each { configuration ->
                configuration.resolve()
            }
        }
    }
}
and update Dockerfile:
ADD build.gradle /opt/app/
WORKDIR /opt/app
RUN gradle resolveDependencies
ADD . .
RUN gradle build -x test --parallel && \
touch build/libs/api.jar
II
Below is what I do now:
build.gradle
ext {
    speed = project.hasProperty('speed') ? project.getProperty('speed') : false
    offlineCompile = new File("$buildDir/output/lib")
}
dependencies {
    if (speed) {
        compile fileTree(dir: offlineCompile, include: '*.jar')
    } else {
        // ...dependencies
    }
}
task downloadRepos(type: Copy) {
    from configurations.all
    into offlineCompile
}
Dockerfile
ADD build.gradle /opt/app/
WORKDIR /opt/app
RUN gradle downloadRepos
ADD . /opt/app
RUN gradle build -Pspeed=true

You might want to consider splitting your application image into two images: one for building myapp.war and the other for running your application. That way, you can use Docker volumes during the actual build and bind the host's ~/.gradle folder into the container performing the build. Instead of a single step to run your application, you would have more steps, though. Example:
builder image
FROM <tag name here for base image including all build time dependencies>
# Add project Source
# -> you can use a project specific gradle.properties in your project root
# in order to override global/user gradle.properties
ADD . /var/app/myapp
RUN mkdir -p /root/.gradle
ENV HOME /root
# declare shared volume path
VOLUME /root/.gradle
WORKDIR /var/app/myapp/
# Compile only
CMD ["./gradlew", "war"]
application image
FROM <tag name here for application base image>
ADD ./ROOT.war /opt/tomcat/webapps/ROOT.war
# Start Tomcat
CMD ["/opt/tomcat/bin/catalina.sh", "run"]
How to use it from your project root, assuming the builder Dockerfile is located there and the application Dockerfile is located in the webapp subfolder (or any other path you prefer):
$ docker build -t builder .
$ docker run --name=build-result -v ~/.gradle/:/root/.gradle/ builder
$ docker cp build-result:/var/app/myapp/build/libs/myapp.war webapp/ROOT.war
$ cd webapp
$ docker build -t application .
$ docker run -d -P application
I haven't tested the shown code, but I hope you get the idea. The example might even be improved by using data volumes for the .gradle/ cache, see the Docker user guide for details.

Current versions of Docker support mounting a "cache" during the build. The cache is local to the Docker environment, so it's not shared with your OS, which is both good and bad: good in that nothing about your system leaks into the build process, bad in that you have to download everything again.
This code is from my Spring Docker Swarm integration rework:
FROM gradle:7.4-jdk17 AS builder
WORKDIR /w
COPY ./ /w
RUN --mount=type=cache,target=/home/gradle/.gradle/caches gradle build --no-daemon -x test
FROM openjdk:17-jdk as extractor
WORKDIR /w
COPY bin/extract.sh /w/extract.sh
COPY --from=builder /w/*/build/libs/*.jar /w/
RUN sh ./extract.sh
FROM openjdk:17-jdk as sample-service
WORKDIR /w
COPY --from=extractor /w/sample-service/* /w/
ENTRYPOINT ["java", "-XX:MaxRAMPercentage=80", "org.springframework.boot.loader.JarLauncher"]
HEALTHCHECK --interval=5s --start-period=60s \
CMD curl -sfo /dev/null http://localhost:8080/actuator/health
USER 5000
EXPOSE 8080
What this does: from my current folder, which is a multi-module Gradle build, I run the build. The extractor stage unbundles the JAR files using the extract.sh script below.
The final stage then assembles the relevant component.
The relevant contents of extract.sh:
#!/bin/sh
set -e
set -x
# Remove support projects that won't be a Spring Boot
# rm buildSrc.jar
# rm gateway-common-*.jar
for jar in *.jar
do
    DIR=$(basename "$jar" -0.0.1-SNAPSHOT.jar)
    mkdir "$DIR"
    java -Djarmode=layertools -jar "$jar" extract --destination "$DIR"
done

Try changing the Gradle user home directory:
RUN mkdir -p /opt/gradle/.gradle
ENV GRADLE_USER_HOME=/opt/gradle/.gradle
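A sketch of how this could fit into a build, combined with copying the build scripts before the sources. The base image tag, the /opt/app path and the presence of a settings.gradle file are assumptions. The idea is that the Gradle user home is an ordinary directory in the image, so whatever Gradle downloads there is stored in the layer and reused from Docker's build cache:

```dockerfile
# Assumption: an official Gradle image; any image with Gradle installed would do
FROM gradle:7.4-jdk17
# Keep Gradle's cache in a plain directory that becomes part of the image layers
RUN mkdir -p /opt/gradle/.gradle
ENV GRADLE_USER_HOME=/opt/gradle/.gradle
WORKDIR /opt/app
# Copy only the build scripts first so the next layer is reused until they change
COPY build.gradle settings.gradle ./
# Resolving the dependency report downloads the root project's dependencies
RUN gradle dependencies --no-daemon
# Source changes invalidate the cache only from this point on
COPY . .
RUN gradle build --no-daemon -x test
```

The gradle dependencies task only reports on the root project, so in a multi-module build some artifacts may still be downloaded during gradle build.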
