Avoiding insecure images from Docker build caching
Docker builds can be slow, so you want to use Docker’s layer caching, reusing previous builds to speed up the current one. And while this will speed up builds, there’s a downside as well: caching can lead to insecure images.
In this article I’ll cover:
- Why caching can mean insecure images.
- Bypassing Docker’s build cache.
- The process you need in place to keep your images secure.
Note: Outside the topic under discussion, the Dockerfiles in this article are not examples of best practices, since the added complexity would obscure the main point of the article.
The problem: caching means no updates
I’m going to assume here that you’re using a stable base image, which means package updates are purely focused on security fixes and severe bug fixes. So as a first pass we can assume you actually want these updates to happen on a regular basis, because they’re both important and unlikely to break your code.
Consider the following Dockerfile:
    FROM ubuntu:18.04
    RUN apt-get update && \
        apt-get upgrade -y && \
        apt-get install -y --no-install-recommends python3
    COPY myapp.py .
    CMD python3 myapp.py
The first time we build it, it will download a variety of Ubuntu packages, which takes a while.
The second time we build it, however, docker build uses the cached layers (assuming you ensured the cache is populated):
    $ docker build -t myimage .
    Sending build context to Docker daemon  2.56kB
    Step 1/4 : FROM ubuntu:18.04
     ---> 94e814e2efa8
    Step 2/4 : RUN apt-get update && apt-get upgrade -y && apt-get install -y --no-install-recommends python3
     ---> Using cache
     ---> 3cea2a611763
    Step 3/4 : COPY myapp.py .
     ---> Using cache
     ---> f6173b1fa111
    Step 4/4 : CMD python3 myapp.py
     ---> Using cache
     ---> 6222b50940a5
    Successfully built 6222b50940a5
    Successfully tagged myimage:latest
Until you change the text of the second line of the Dockerfile (“apt-get update etc.”), every time you do a build that relies on the cache you’ll get the same Ubuntu packages you installed the first time.
As long as you’re relying on caching, you’ll still get the old, insecure packages distributed in your images even after Ubuntu has released security updates.
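One way to see this staleness for yourself is to ask apt, inside a container started from the cached image, which packages it would upgrade today. The following is a minimal sketch, assuming the image is Debian- or Ubuntu-based, runs as root by default, and is tagged myimage; the function name list_pending_upgrades is hypothetical.

```shell
# Hypothetical helper: print the packages an image would upgrade right now.
# Assumes a Debian/Ubuntu-based image whose default user is root.
list_pending_upgrades() {
    image="$1"
    # Refresh the package index inside a throwaway container, then simulate
    # an upgrade (-s changes nothing). Each "Inst" line names one package
    # that has a newer version available than the one baked into the image.
    # The exit status is grep's: non-zero when no "Inst" lines are found.
    docker run --rm "$image" \
        sh -c 'apt-get update -qq >/dev/null && apt-get -s upgrade' \
        | grep '^Inst'
}
```

Calling list_pending_upgrades myimage on an image that was built from cache some weeks ago will typically list the updates released since the first build.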
That suggests that sometimes you’re going to want to bypass the caching.
You can do so by passing two arguments to docker build:
- --pull: This pulls the latest version of the base Docker image, instead of using the locally cached one.
- --no-cache: This ensures all additional layers in the Dockerfile get rebuilt from scratch, instead of relying on the layer cache.
If you add those arguments to docker build, you can be sure that the new image has the latest (system-level) packages and security updates.
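As a concrete sketch, the two flags can be combined in a single invocation. The wrapper name build_fresh is hypothetical, and myimage is simply the tag used in the earlier build; only the docker build flags themselves come from the discussion above.

```shell
# Hypothetical wrapper: rebuild an image without trusting any cached state.
build_fresh() {
    tag="$1"            # image tag, for example myimage
    context="${2:-.}"   # build context directory, defaults to the current one
    # --pull: check the registry for a newer version of the base image.
    # --no-cache: rerun every Dockerfile instruction instead of reusing layers.
    docker build --pull --no-cache -t "$tag" "$context"
}
```

Running build_fresh myimage from the project directory is then equivalent to typing docker build --pull --no-cache -t myimage . by hand.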
Rebuild your images regularly
If you want both the benefits of caching, and to get security updates within a reasonable amount of time, you will need two build processes:
- The normal image build process that happens whenever you release new code.
- Once a week, or every night, rebuild your Docker image from scratch using docker build --pull --no-cache to ensure you have security updates.
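The two processes can share one entry point that differs only in the flags it passes. This is a rough sketch rather than a prescribed setup: the function name build_image, the mode names release and scheduled, and the default tag myimage are all assumptions made for illustration.

```shell
# Hypothetical entry point covering both build processes.
# Usage: build_image release|scheduled [tag]
build_image() {
    mode="$1"
    tag="${2:-myimage}"
    case "$mode" in
        release)
            # Everyday build for new code: reuse cached layers for speed.
            docker build -t "$tag" .
            ;;
        scheduled)
            # Nightly or weekly build: refresh the base image and rebuild
            # every layer so security updates are picked up.
            docker build --pull --no-cache -t "$tag" .
            ;;
        *)
            echo "usage: build_image release|scheduled [tag]" >&2
            return 2
            ;;
    esac
}
```

Your release pipeline would call build_image release, while a scheduler such as cron or your CI system's scheduled jobs would call build_image scheduled.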