Builds require a few properties, chief among them reproducibility. I would consider speed to be low on the list of priorities. However, it’s also one of the most limiting factors of your release cycle: if your build takes T, you cannot release more often than once every T. Hence, once you’ve reached a certain maturity level, you’ll probably want to speed up your builds to enable more frequent releases.
In this article, I want to detail some techniques you can leverage to make your Maven builds faster, first outside and then inside of Docker.
Since I want to propose techniques and evaluate their impact, we need a sample repository. I’ve chosen the Hazelcast code samples because they provide a large enough multi-module code base; the exact commit is 448febd.
The rules are the following:
- I run the command five times to avoid temporary issues
- I execute `mvn clean` between each run to start from an empty `target` folder
- All dependencies and plugins are already downloaded
- I report the time that Maven displays in the console log:
[INFO] -------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] -------------------------------------------------------
[INFO] Total time: 22.456 s (Wall Clock)
[INFO] Finished at: 2021-09-24T23:20:41+02:00
[INFO] -------------------------------------------------------
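The rules above could be automated along these lines. This is only a sketch: the `bench` function and the `MVN` and `RUNS` variables are names made up for this illustration, not part of the original setup.

```shell
# Hypothetical benchmark helper (names are assumptions, not from the article).
MVN="${MVN:-mvn}"     # Maven executable to measure
RUNS="${RUNS:-5}"     # number of repetitions

bench() {
  i=1
  while [ "$i" -le "$RUNS" ]; do
    # Start every run from a clean state; discard the output of the clean step.
    "$MVN" -q clean >/dev/null
    # Run the build with the given arguments and keep only the reported time.
    "$MVN" "$@" | grep 'Total time'
    i=$((i + 1))
  done
}
```

Calling `bench test` would then print one `Total time` line per run for the baseline, and `bench test -T 1C` would do the same for a parallel build.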
Let’s start with our baseline, `mvn test`. The results are:
- 02:00 min
- 01:57 min
- 01:58 min
- 01:56 min
- 01:58 min
Using all CPUs
By default, Maven uses a single thread. In the age of multicore processors, this is just a waste. It’s possible to run parallel builds using multiple threads by setting an absolute number or a number relative to the number of available cores. For more information, please check the relevant documentation.
The more submodules you have that don’t depend on each other, i.e., that Maven can build in parallel, the better the results you’ll achieve with this technique. It fits our codebase very well.
We are going to use as many threads as there are available cores. The relevant command is `mvn test -T 1C`.
When the command starts, you should see the following message in the console:
Using the MultiThreadedBuilder implementation with a thread count of X
- 51.487 s (Wall Clock)
- 40.322 s (Wall Clock)
- 52.468 s (Wall Clock)
- 41.862 s (Wall Clock)
- 41.699 s (Wall Clock)
The numbers are much better but with a higher variance.
Parallel test execution
Parallelization is an excellent technique. We can do the same regarding test execution. By default, the Maven Surefire plugin runs tests sequentially, but it’s possible to configure it to run tests in parallel. Please refer to the documentation for the whole set of options.
This approach is excellent if you’ve got a large number of unit tests in each module. Note that your tests need to be independent of one another.
We will manually set the number of threads:
mvn test -Dparallel=all -DperCoreThreadCount=false -DthreadCount=16
- `-Dparallel=all`: configure Surefire to run both classes and methods in parallel
- `-DperCoreThreadCount=false -DthreadCount=16`: manually override the thread count to 16
Let’s run it:
- 02:04 min
- 02:03 min
- 01:46 min
- 01:52 min
- 01:53 min
It seems that the cost of thread synchronization offsets the potential gain of running parallel tests.
Maven will check whether a `SNAPSHOT` dependency has a new “version” at every run. It means additional network roundtrips. We can prevent this check by running Maven offline.
While you should avoid `SNAPSHOT` dependencies, they’re sometimes unavoidable, especially during development.
The command is `mvn test -o`, `-o` being the short form of the offline option.
- 01:46 min
- 01:46 min
- 01:47 min
- 01:55 min
- 01:44 min
The codebase has a considerable number of `SNAPSHOT` dependencies; hence offline speeds up the build significantly.
Maven itself is a Java-based application. It means each run starts a new JVM. A JVM first interprets the bytecode, then analyzes the workload and compiles the bytecode to native code accordingly: it means peak performance, but only after a (long) while. It’s great for long-running processes, not so much for command-line applications.
We will likely not reach the peak performance point in the context of builds since they are relatively short-lived, but we are still paying for the analysis cost. We can configure Maven to forego it by setting the adequate JVM parameters. Several ways of configuring the JVM are available. The most straightforward way is to create a dedicated `jvm.config` configuration file in a `.mvn` subfolder in the project’s folder.
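The exact content of the file is not reproduced in this text. As a hedged illustration, the sketch below writes a commonly used pair of HotSpot flags that stop the JIT at the first compilation tier; both the flag choice and the `write_jvm_config` helper name are assumptions, not necessarily what produced the timings in this article.

```shell
# Hypothetical helper: create .mvn/jvm.config in a project folder.
# Assumption: these flags are a common choice, not confirmed by the article.
write_jvm_config() {
  project_dir="${1:-.}"
  mkdir -p "$project_dir/.mvn"
  # Maven reads this file and passes its content to the JVM it starts.
  printf '%s\n' '-XX:+TieredCompilation -XX:TieredStopAtLevel=1' \
    > "$project_dir/.mvn/jvm.config"
}
```

Running `write_jvm_config .` at the root of the project is enough; no extra command-line option is needed afterwards.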
Let’s now simply run the baseline command again:
- 01:44 min
- 01:44 min
- 01:53 min
- 01:53 min
- 01:55 min
The Gradle documentation describes the same problem and its solution as follows: “Gradle runs on the Java Virtual Machine (JVM) and uses several supporting libraries that require a non-trivial initialization time. As a result, it can sometimes seem a little slow to start. The solution to this problem is the Gradle Daemon: a long-lived background process that executes your builds much more quickly than would otherwise be the case. We accomplish this by avoiding the expensive bootstrapping process and leveraging caching by keeping data about your project in memory.”
The Gradle team recognized early that a command-line tool was not the best usage of the JVM. To fix that, one keeps a JVM background process, the daemon, always up. It acts as a server while the CLI itself plays the role of the client.
As an additional benefit, such a long-running process loads classes only once (if they didn’t change between runs).
Once you have installed the Maven daemon, you can run it with the `mvnd` command instead of the standard `mvn` one. Here are the results with the daemon:
- 33.124 s (Wall Clock)
- 33.114 s (Wall Clock)
- 34.440 s (Wall Clock)
- 32.025 s (Wall Clock)
- 29.364 s (Wall Clock)
Note that the daemon uses multiple threads by default, with `number of cores - 1`.
Mixing and matching
We’ve seen several ways to speed up the build. What if we used them in conjunction?
Let’s first try with every technique we’ve seen so far in the same run:
mvnd test -Dparallel=all -DperCoreThreadCount=false -DthreadCount=16 -o
- `mvnd`: use the Maven daemon
- `-Dparallel=all -DperCoreThreadCount=false -DthreadCount=16`: run the tests in parallel
- `-o`: don’t update `SNAPSHOT` dependencies
- Configure the JVM parameters as above via the `jvm.config` file; no need to set any option
The command returns the following results:
- 27.061 s (Wall Clock)
- 24.457 s (Wall Clock)
- 24.853 s (Wall Clock)
- 25.772 s (Wall Clock)
Thinking about it, the Maven daemon is a long-running process. It thus makes sense to let the JVM analyze and compile the bytecode to native code. We can remove the `jvm.config` file and re-run the above command. Results are:
- 23.840 s (Wall Clock)
- 26.589 s (Wall Clock)
- 22.283 s (Wall Clock)
- 23.788 s (Wall Clock)
- 22.456 s (Wall Clock)
Now we can display the consolidated results:
| | Baseline | Parallel build | Parallel tests | Offline | JVM params | Daemon | Daemon + offline + parallel tests + parameters | Daemon + offline + parallel tests |
|---|---|---|---|---|---|---|---|---|
| Gain from baseline (s) | – | 72.40 | 1.40 | 10.20 | 8.00 | 85.39 | 92.26 | 94.01 |
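Most gains in this table look like the difference between the mean baseline time and the mean time of a technique. Here is a small sketch of that arithmetic, assuming this reading is right; `mean` and `gain` are hypothetical helper names, and the inputs are the baseline and daemon timings listed above, in seconds.

```shell
# Mean of the values passed as arguments, rounded to two decimals.
mean() {
  printf '%s\n' "$@" | awk '{ sum += $1 } END { printf "%.2f\n", sum / NR }'
}

# Difference between a baseline mean and another mean.
gain() {
  awk -v base="$1" -v value="$2" 'BEGIN { printf "%.2f\n", base - value }'
}

baseline=$(mean 120 117 118 116 118)                 # 117.80
daemon=$(mean 33.124 33.114 34.440 32.025 29.364)    # 32.41
gain "$baseline" "$daemon"                           # prints 85.39
```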
Raw Maven summary
At this point, we have seen several ways to speed up your Maven build:
- Maven daemon: A solid, safe starting point
- Parallelize builds: When the build contains multiple modules that are independent of each other
- Parallelize tests: When the project contains multiple tests
- Offline: When the project contains `SNAPSHOT` dependencies and you don’t need to update them
- JVM parameters: When you want to go the extra mile
I’d advise every user to start using the Maven daemon and continue optimizing if necessary and depending on your project.
Now, I’d like to widen the scope and do the same for Maven builds inside Docker.
Between each run, we change the source code by adding a single blank line; between each section, we remove all built images, including the intermediate ones that are the results of the multi-stage build. The idea is to avoid reusing a previously built image.
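This procedure could be scripted roughly as follows. It is a sketch under assumptions: the helper names are invented for this illustration, the source file to modify is whatever file you pick in your own project, and `docker image prune --all` removes every unused image on the machine, so use it with care.

```shell
# Hypothetical helpers for the Docker measurements (names are assumptions).

# Add a single blank line to a source file so that Docker sees a change.
touch_source() {
  printf '\n' >> "$1"
}

# Build the image with the given tag and report the elapsed time.
timed_build() {
  time docker build -t "$1" .
}

# Between sections: remove all unused images, intermediate ones included.
reset_images() {
  docker image prune --all --force
}
```

A run would then be `touch_source <some source file>` followed by `timed_build fast-maven:1.0`, with `reset_images` called once before moving to the next section.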
To compute a helpful baseline, we need a sample project. I created one just for this purpose: it’s a relatively small Kotlin project.
Here’s the relevant `Dockerfile`:
# Start from a JDK image for the packaging step
FROM openjdk:11-slim-buster as build
# Add required resources
COPY .mvn .mvn
COPY mvnw .
COPY pom.xml .
COPY src src
# Create the JAR
RUN ./mvnw -B package
# Start from a JRE for the image creation step
FROM openjdk:11-jre-slim-buster
# Copy the JAR from the previous step
COPY --from=build target/fast-maven-builds-1.0.jar .
EXPOSE 8080
# Set the entry point
ENTRYPOINT ["java", "-jar", "fast-maven-builds-1.0.jar"]
Let’s execute the build:
time DOCKER_BUILDKIT=0 docker build -t fast-maven:1.0 .
Ignore the environment variable for now; I’ll explain it in the next section.
Here are the results of the five runs:
- 0.36s user 0.53s system 0% cpu 1:53.06 total
- 0.36s user 0.56s system 0% cpu 1:52.50 total
- 0.35s user 0.55s system 0% cpu 1:56.92 total
- 0.36s user 0.56s system 0% cpu 2:04.55 total
- 0.38s user 0.61s system 0% cpu 2:04.68 total
Buildkit for the win
The last command line set the `DOCKER_BUILDKIT` environment variable to `0`. It’s the way to tell Docker to use the legacy engine. If you haven’t updated Docker for some time, it’s the engine that you’re using. Nowadays, BuildKit has superseded it and is the new default.
BuildKit brings several performance improvements:
- Automatic garbage collection
- Concurrent dependency resolution
- Efficient instruction caching
- Build cache import/export
Let’s re-execute the previous command on the new engine:
time docker build -t fast-maven:1.1 .
Here’s an excerpt of the console log of the first run:
...
=> => transferring context: 4.35kB
=> [build 2/6] COPY .mvn .mvn
=> [build 3/6] COPY mvnw .
=> [build 4/6] COPY pom.xml .
=> [build 5/6] COPY src src
=> [build 6/6] RUN ./mvnw -B package
...
0.68s user 1.04s system 1% cpu 2:06.33 total
The following executions of the same command have a slightly different output:
...
=> => transferring context: 1.82kB
=> CACHED [build 2/6] COPY .mvn .mvn
=> CACHED [build 3/6] COPY mvnw .
=> CACHED [build 4/6] COPY pom.xml .
=> [build 5/6] COPY src src
=> [build 6/6] RUN ./mvnw -B package
...
Remember that we change the source code between runs. Files that we do not change, such as `pom.xml`, are cached by BuildKit. But these resources are small, so caching them doesn’t significantly improve the build time.
- 0.69s user 1.01s system 1% cpu 2:05.08 total
- 0.65s user 0.95s system 1% cpu 1:58.51 total
- 0.68s user 0.99s system 1% cpu 1:59.31 total
- 0.64s user 0.95s system 1% cpu 1:59.82 total
A quick glance at the logs reveals that the biggest bottleneck in the build is the download of all dependencies (including plugins). It occurs every time we change the source code. That’s the reason why BuildKit doesn’t improve the performance.
Layers, layers, layers
We should focus our efforts on the dependencies. For that, we can leverage layers and split the build into two steps:
- In the first step, we download dependencies
- In the second one, we do the proper packaging
Each step creates a layer, the second depending on the first.
With layering, if we change the source code in the second layer, the first layer is not impacted and can be reused. We don’t need to download dependencies again. The new `Dockerfile` looks like:
FROM openjdk:11-slim-buster as build
COPY .mvn .mvn
COPY mvnw .
COPY pom.xml .
# The go-offline goal downloads all dependencies and plugins
RUN ./mvnw -B dependency:go-offline
COPY src src
# At this point, all dependencies are available
RUN ./mvnw -B package
FROM openjdk:11-jre-slim-buster
COPY --from=build target/fast-maven-builds-1.2.jar .
EXPOSE 8080
ENTRYPOINT ["java", "-jar", "fast-maven-builds-1.2.jar"]
Note that `go-offline` doesn’t download everything. The build won’t run successfully if you then try to use the `-o` option (for offline). It’s a well-known old bug. In any case, it’s “good enough”.
Let’s run the build:
time docker build -t fast-maven:1.2 .
The first run takes significantly more time than the baseline:
0.84s user 1.21s system 1% cpu 2:35.47 total
However, the subsequent builds are much faster. Changing the source code only affects the second layer and doesn’t trigger the download of (most) dependencies:
- 0.23s user 0.36s system 5% cpu 9.913 total
- 0.21s user 0.33s system 5% cpu 9.923 total
- 0.22s user 0.38s system 6% cpu 9.990 total
- 0.21s user 0.34s system 5% cpu 9.814 total
- 0.22s user 0.37s system 5% cpu 10.454 total
Volume mount in build
Layering the build improved the build time drastically. We can change the source code and keep it low. There’s one remaining issue, though. Changing a single dependency invalidates the layer, so we need to download all of them again.
Fortunately, BuildKit introduces volume mounts during the build (and not only during the run). Several types of mounts are available, but the one that interests us is the cache mount. It’s an experimental feature, so you need to explicitly opt in:
# syntax=docker/dockerfile:experimental
# The directive above opts in to experimental features
FROM openjdk:11-slim-buster as build
COPY .mvn .mvn
COPY mvnw .
COPY pom.xml .
COPY src src
# Build using the cache
RUN --mount=type=cache,target=/root/.m2,rw ./mvnw -B package
FROM openjdk:11-jre-slim-buster
COPY --from=build target/fast-maven-builds-1.3.jar .
EXPOSE 8080
ENTRYPOINT ["java", "-jar", "fast-maven-builds-1.3.jar"]
It’s time to run the build:
time docker build -t fast-maven:1.3 .
The first build takes roughly as long as the regular build and is noticeably faster than the first layers build:
0.71s user 1.01s system 1% cpu 1:50.50 total
The following builds are on par with layers:
- 0.22s user 0.33s system 5% cpu 9.677 total
- 0.30s user 0.36s system 6% cpu 10.603 total
- 0.24s user 0.37s system 5% cpu 10.461 total
- 0.24s user 0.39s system 6% cpu 10.178 total
- 0.24s user 0.35s system 5% cpu 10.283 total
However, as opposed to layers, we only need to download updated dependencies. Here, let’s change Kotlin’s version to `1.5.31`:
<properties>
  <kotlin.version>1.5.31</kotlin.version>
</properties>
It’s a huge improvement regarding the build time:
- 0.41s user 0.57s system 2% cpu 44.710 total
Considering the Maven daemon
Earlier, regarding regular Maven builds, I mentioned the Maven daemon. Let’s change our build accordingly:
FROM openjdk:11-slim-buster as build
# Download the latest version of the Maven daemon
ADD https://github.com/mvndaemon/mvnd/releases/download/0.6.0/mvnd-0.6.0-linux-amd64.zip .
# Refresh the package index, install unzip, create a dedicated folder,
# extract the downloaded archive, and move its content to that folder
RUN apt-get update \
 && apt-get install unzip \
 && mkdir /opt/mvnd \
 && unzip mvnd-0.6.0-linux-amd64.zip \
 && mv mvnd-0.6.0-linux-amd64/* /opt/mvnd
COPY .mvn .mvn
COPY mvnw .
COPY pom.xml .
COPY src src
# Use mvnd instead of the Maven wrapper
RUN /opt/mvnd/bin/mvnd -B package
FROM openjdk:11-jre-slim-buster
COPY --from=build target/fast-maven-builds-1.4.jar .
EXPOSE 8080
ENTRYPOINT ["java", "-jar", "fast-maven-builds-1.4.jar"]
Let’s run the build now:
docker build -t fast-maven:1.4 .
The log outputs the following:
- 0.70s user 1.01s system 1% cpu 1:51.96 total
- 0.72s user 0.98s system 1% cpu 1:47.93 total
- 0.66s user 0.93s system 1% cpu 1:46.07 total
- 0.76s user 1.04s system 1% cpu 1:50.35 total
- 0.80s user 1.18s system 1% cpu 2:01.45 total
There’s no significant improvement compared to the baseline.
I tried to create a dedicated `mvnd` image and use it as a parent image:
# docker build -t mvnd:0.6.0 .
FROM openjdk:11-slim-buster as build
ADD https://github.com/mvndaemon/mvnd/releases/download/0.6.0/mvnd-0.6.0-linux-amd64.zip .
RUN --mount=type=cache,target=/var/cache/apt,rw apt-get update \
 && apt-get install unzip \
 && mkdir /opt/mvnd \
 && unzip mvnd-0.6.0-linux-amd64.zip \
 && mv mvnd-0.6.0-linux-amd64/* /opt/mvnd

# docker build -t fast-maven:1.5 .
FROM mvnd:0.6.0 as build
# ...
This approach doesn’t change the output in any significant way.
mvnd is only good when the daemon is up during several runs. I found no way to do that with Docker. If you’ve any idea on how to achieve it, please tell me; extra points if you can point me to an implementation.
Here’s the summary of all execution times:
| | Baseline | BuildKit | Layers | Volume mount | mvnd |
|---|---|---|---|---|---|
| Gain from baseline (s) | 0 | -2.34 | 108.43 | 108.10 | 6.79 |
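The timings in this part come from the shell’s `time` output, where the last field mixes `m:ss.xx` and plain seconds. To compare runs, a helper along these lines can average them in seconds; `mean_seconds` is a hypothetical name, and the sketch assumes the gain is the baseline mean minus the mean of each approach.

```shell
# Average elapsed times given either as seconds (9.913) or as m:ss.xx (1:53.06).
mean_seconds() {
  printf '%s\n' "$@" | awk -F: '
    { s = (NF == 2) ? $1 * 60 + $2 : $1; sum += s }
    END { printf "%.2f\n", sum / NR }'
}

# Baseline runs from the first Docker build above
mean_seconds 1:53.06 1:52.50 1:56.92 2:04.55 2:04.68   # prints 118.34
```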
Speeding up Maven builds inside of Docker is pretty different from speeding up regular builds. In Docker, the limiting factor is the download speed of dependencies. If you’re stuck on an old Docker version, you need to use layers to cache dependencies.
With BuildKit, I recommend using the new cache mount capability to avoid downloading all dependencies if the layer is invalidated.
The complete source code for this post can be found on GitHub in Maven format.
To go further:
- Parallel builds in Maven 3
- Surefire Parallel Test Execution
- mvnd – the Maven Daemon
- Introducing BuildKit
- Docker Layers Explained
- Build Mounts