Speeding up Docker builds in CI with BuildKit

No one enjoys waiting, and waiting for your software to build and your tests to run isn’t just unpleasant, it’s expensive. And if you’re building your Docker image in a CI system like GitHub Actions with ephemeral runners, where a new environment gets spun up for every build, your builds are going to be extra slow by default.

In particular, when you spin up a new VM with a new Docker instance, the cache is empty, so when you run the Docker build your image has to be built from scratch.

Luckily, Docker includes some features that let you warm up the cache by pulling previous versions of the image. The newer BuildKit build system improves this even further, but it also requires some changes, or caching will stop working altogether. And as of Docker 23.0, BuildKit is enabled by default.

Let’s see how you can speed up your Docker builds in CI with classic Docker, and then the improvements (and pitfall) provided by BuildKit.

Why building Docker images in CI can be slow

When you rebuild an existing image, Docker can look in its local cache for existing layers and reuse those if nothing has changed. This allows for faster builds.
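Roughly speaking, Docker decides whether a layer can be reused by comparing the build instruction together with a checksum of the files it copies; if either changes, that layer and every layer after it must be rebuilt. Here’s a simplified, purely illustrative sketch of that idea in shell (this is not Docker’s actual algorithm):

```shell
#!/bin/bash
set -euo pipefail

# Simplified stand-in for Docker's layer-cache key: hash the build
# instruction together with the content of the file it copies.
layer_key() {
    printf '%s' "$1" | cat - "$2" | sha256sum | cut -d' ' -f1
}

reqs=$(mktemp)
echo "flask" > "$reqs"

key1=$(layer_key "COPY requirements.txt ." "$reqs")
key2=$(layer_key "COPY requirements.txt ." "$reqs")
# Unchanged file: same key, so the cached layer can be reused.
[ "$key1" = "$key2" ] && echo "cache hit"

echo "flask gunicorn" > "$reqs"
key3=$(layer_key "COPY requirements.txt ." "$reqs")
# Changed file: different key, so this layer (and all layers
# after it) must be rebuilt.
[ "$key1" != "$key3" ] && echo "cache miss"
```

The important consequence is the ordering in the Dockerfile below: copying requirements.txt on its own line means a code-only change doesn’t invalidate the expensive dependency-install layer.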

However, in many cases CI runs on a new virtual machine or environment on every run. For example, whenever you run a task in GitHub Actions by default you will be using a new virtual machine. A new virtual machine means a new Docker install, and a new Docker install has an empty cache.

An empty cache means your image will be rebuilt from scratch—and that’s slow.

Speeding up CI builds in classic Docker

To make testing easier, I’m going to spin up a local Docker registry on my computer, equivalent to hub.docker.com or a cloud image registry:

$ docker run -d -p 5000:5000 --name registry registry:2
fb90defa3e7543accbafc15eb94d6c090204f0002c884851804a38e7f8d3fed9

Here’s the Dockerfile we’re going to be building; it’s set up so that if the code changes but requirements.txt is the same, Docker will be able to use the cached layer with the installed dependencies:

FROM python:3.9-slim-buster
COPY requirements.txt .
RUN pip install --quiet -r requirements.txt
COPY . .
ENTRYPOINT ["python", "app.py"]

Note: Except for whatever specific best practice is being demonstrated, the Dockerfiles in this article are not examples of best practices; the added complexity would obscure the main point of the article.


In order to simulate building in an ephemeral, newly created VM, we’re going to use the following script to clear the cache in between builds:

#!/bin/bash
set -euo pipefail

# Simulate a newly created virtual machine with empty cache.
# The '|| true' allows the shell script to continue if that
# fails because the image doesn't exist.
docker image rm myapp || true
docker image rm localhost:5000/myapp || true
docker image prune -f
docker buildx prune -f  # clear buildkit cache
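As an aside, the `|| true` idiom is needed because of the `set -euo pipefail` at the top of the script: under `set -e`, any command that fails aborts the whole script. A minimal, Docker-free sketch of the pattern:

```shell
#!/bin/bash
set -euo pipefail

# Under 'set -e' this failing 'rm' would normally abort the script;
# appending '|| true' converts the failure into a success, so
# execution continues.
rm /no/such/file 2>/dev/null || true

echo "script kept going"
```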

Here’s our first pass at a build script:

#!/bin/bash
set -euo pipefail

# Build the image:
docker build -t myapp .

# Push to registry:
docker tag myapp localhost:5000/myapp
docker push localhost:5000/myapp

Let’s run this a couple of times; if caching were working correctly, the second run would be really quick:

$ time ./build.sh
...
real    0m25.124s
user    0m0.130s
sys     0m0.093s
$ ./clear-cache.sh
...
$ time ./build.sh
...
real    0m25.178s
user    0m0.113s
sys     0m0.098s

As expected, because we’re simulating a new VM with an empty cache, the second build is no faster.

Warming the cache

In order to speed up the builds, we need to “warm” the cache. In classic Docker we do this by:

  1. Pulling the previously built image.
  2. Using the --cache-from flag to tell docker build to use the pulled image as a source of cached layers.

We modify build.sh appropriately:

#!/bin/bash
set -euo pipefail

# Pull the image:
docker pull localhost:5000/myapp

# Build the image:
docker build -t myapp --cache-from localhost:5000/myapp .

# Push to registry:
docker tag myapp localhost:5000/myapp
docker push localhost:5000/myapp

Now if we run the script:

$ ./clear-cache.sh
...
$ time ./build-caching.sh
...
real    0m1.817s
user    0m0.160s
sys     0m0.121s

That’s a lot faster!

BuildKit: faster, but with pitfalls

Consider how classic caching works: we have to pull the whole image up front. But if, for example, our code has changed, we never actually needed to download the layer containing the code, since we weren’t going to reuse it.

Ideally we could just point Docker at the image registry as part of the build, and it would only download the layers it was actually going to reuse. Technically this was possible with classic Docker, but apparently it was buggy and unreliable.

With BuildKit, the new build system for Docker, this is a built-in feature: you can skip the docker pull and just have the build pull the layers it needs.

There is a pitfall, though: by default, BuildKit doesn’t embed in the image the metadata needed to reuse it as a cache source. To include it, you have to add an extra flag, --build-arg BUILDKIT_INLINE_CACHE=1; otherwise caching won’t work at all, whether or not you’ve pulled the image.

Here’s our new build script:

#!/bin/bash
set -euo pipefail

# Enable BuildKit (unnecessary in 23.0 and later):
export DOCKER_BUILDKIT=1

# Build the image; no pull needed, just make sure
# --cache-from has the full image name you would pull/push:
docker build -t myapp \
       --cache-from localhost:5000/myapp \
       --build-arg BUILDKIT_INLINE_CACHE=1 \
       .

# Push to registry:
docker tag myapp localhost:5000/myapp
docker push localhost:5000/myapp

Let’s try it out:

$ ./clear-cache.sh
...
$ time ./build-buildkit.sh
...
real    0m23.617s
user    0m0.095s
sys     0m0.088s
$ ./clear-cache.sh
...
$ time ./build-buildkit.sh
...
real    0m1.641s
user    0m0.080s
sys     0m0.051s

The first build starts with a completely empty cache, so it takes the full amount of time. The second build is sped up—and we didn’t have to do an explicit pull!

In many cases that can speed up builds, since BuildKit pulls only the layers it needs. If you’re using multi-stage builds, BuildKit will also run independent stages in parallel, giving more opportunities for a speed-up.
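For example, here’s a hypothetical multi-stage variant of the earlier Dockerfile (the stage name and paths are illustrative, not a recommendation), where dependencies are built as wheels in a throwaway stage that never reaches the final image:

```dockerfile
# Build stage: compile dependencies into wheels.
FROM python:3.9-slim-buster AS deps
COPY requirements.txt .
RUN pip wheel --quiet --wheel-dir /wheels -r requirements.txt

# Final stage: install the pre-built wheels, then copy in the code.
FROM python:3.9-slim-buster
COPY --from=deps /wheels /wheels
RUN pip install --quiet /wheels/*.whl
COPY . .
ENTRYPOINT ["python", "app.py"]
```

One caveat worth knowing: inline caching only records cache metadata for layers that end up in the final image, so intermediate stages like the one above may not benefit from --cache-from; BuildKit’s registry cache exporter is the usual workaround for that.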

Go faster!

If you’re building Docker images in CI, and each CI run starts with an empty cache, make sure you’re using these techniques to keep your cache warm. You’ll get faster builds, save a little money, and save a little CO₂ too.
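Putting it all together in GitHub Actions might look something like this sketch, where the registry hostname, image name, and secret names are placeholders you’d substitute with your own:

```yaml
name: build
on: push

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Log in to the image registry
        run: |
          echo "${{ secrets.REGISTRY_PASSWORD }}" | \
            docker login registry.example.com \
              -u "${{ secrets.REGISTRY_USER }}" --password-stdin
      - name: Build with a warm cache
        run: |
          docker build -t myapp \
            --cache-from registry.example.com/myapp \
            --build-arg BUILDKIT_INLINE_CACHE=1 \
            .
      - name: Push the new image
        run: |
          docker tag myapp registry.example.com/myapp
          docker push registry.example.com/myapp
```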