Introduction to Dockerizing for Production

Improve your DevOps skills: learn an iterative process for Dockerizing your code.

Poetry vs. Docker caching: Fight!

by Itamar Turner-Trauring
Last updated 30 Jan 2025, originally created 05 Nov 2020

Docker packaging is an exercise in shoving square pegs into round holes, over and over and over again.

Consider the Poetry packaging tool for Python. One of Poetry’s features can make Docker rebuilds slower, by breaking Docker’s caching.

And it’s not a bad feature, there’s nothing really wrong with it, it just—doesn’t fit.

Let’s see what the problem is, go over some workarounds—which have their own problems, obviously—and then briefly consider why everything about Docker packaging is always slightly broken.

Recap: faster rebuilds by installing dependencies separately

As a reminder:

When you rebuild a Docker image it can use caching to speed up the rebuild process. The caching will be invalidated if you COPY in a changed file.
When installing your dependencies and code, you’ll therefore want to copy in the dependencies file first, and separately. This lets dependency installation be sped up by caching even if your code changes.

For example, we copy requirements.txt in first, and install dependencies using it, then COPY in the rest of the code:

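FROM python:3.13-slim
# Copy in only the dependency list, so this layer stays cached when code changes:
COPY requirements.txt /tmp/
RUN pip install -r /tmp/requirements.txt
# Copy in the rest of the code and install it:
COPY . /tmp/myapp
RUN pip install /tmp/myapp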

Note: Outside any specific best practice being demonstrated, the Dockerfiles in this article are not examples of best practices, since the added complexity would obscure the main point of the article.
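
To see why the ordering matters, here is a toy Python model of layer caching. It is a simplified sketch, not how Docker actually computes its cache: the function names layer_keys and cached_steps, the file names, and the file contents are all made up for illustration. Each step’s key chains the previous step’s key, the instruction text, and the contents of any files the step copies in, so a change invalidates that step and everything after it.

```python
import hashlib


def layer_keys(steps, files):
    """Return one cache key per build step.

    steps: list of (instruction, copied_paths) tuples, in Dockerfile order.
    files: dict mapping path -> file contents (bytes).
    """
    keys = []
    parent = b""
    for instruction, copied in steps:
        h = hashlib.sha256(parent)
        h.update(instruction.encode())
        for path in sorted(copied):
            h.update(path.encode())
            h.update(files[path])
        parent = h.digest()
        keys.append(h.hexdigest())
    return keys


def cached_steps(old_keys, new_keys):
    """Count the leading steps whose keys match, i.e. steps served from cache."""
    count = 0
    for old, new in zip(old_keys, new_keys):
        if old != new:
            break
        count += 1
    return count


# A rough stand-in for the four build steps of the requirements.txt example.
STEPS = [
    ("COPY requirements.txt /tmp/", ["requirements.txt"]),
    ("RUN pip install -r /tmp/requirements.txt", []),
    ("COPY . /tmp/myapp", ["requirements.txt", "app.py"]),
    ("RUN pip install /tmp/myapp", []),
]

before = {"requirements.txt": b"flask==3.0.0\n", "app.py": b"print('v1')\n"}
code_change = {**before, "app.py": b"print('v2')\n"}
deps_change = {**before, "requirements.txt": b"flask==3.0.1\n"}

base = layer_keys(STEPS, before)
print(cached_steps(base, layer_keys(STEPS, code_change)))  # 2
print(cached_steps(base, layer_keys(STEPS, deps_change)))  # 0
```

In this model a code-only change keeps the first two steps cached, including the slow pip install of the dependencies, while a dependency change invalidates everything.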

Poetry time

Let’s see how we do this two-step install with Poetry.

Poetry has two relevant files.

pyproject.toml, the standard Python config file, holds Poetry-specific configuration, including your high-level dependencies.
poetry.lock contains pinned versions of all dependencies, including transitive ones.

We’ll have to copy them both in:

FROM python:3.8-slim-buster

WORKDIR /app

# Install poetry:
RUN pip install poetry

# Copy in the config files:
COPY pyproject.toml poetry.lock ./
# Install only dependencies, skipping dev dependencies
# (--without dev replaces the deprecated --no-dev flag):
RUN poetry install --no-root --without dev

# Copy in everything else and install:
COPY . .
RUN poetry install --without dev

So far, so good: unless our dependencies change, thereby changing pyproject.toml and poetry.lock, Docker image rebuilds will be able to use cached layers because the two copied files won’t have changed.

But there’s a problem.

`pyproject.toml`: more than just dependencies

As mentioned above, pyproject.toml is where you list dependencies when you’re using Poetry. Let’s take a look at an example:

[tool.poetry]
name = "myexample"
version = "0.1.0"
description = ""
authors = ["Itamar Turner-Trauring"]

[tool.poetry.dependencies]
python = "^3.6"
Flask = "^1.1.2"

# ...

Do you spot the problem?

There’s a version field for your application.
Every time you update that version field, your pyproject.toml changes.
This invalidates the Docker cache when you rebuild your image.
As a result, your Docker build has to reinstall all your dependencies from scratch, slowing things down.

Now, quite possibly you only update that field infrequently, and you can live with occasional slow rebuilds. But if you’re doing some sort of continuous deployment process where you’re continuously updating the version field, your Docker builds are going to be slow.

Some workarounds

First, as mentioned above, you can choose not to care.

Second, instead of installing dependencies with Poetry, you can install them with pip. Specifically, you can use the poetry export plugin to create a standalone requirements.txt, and then just copy the requirements.txt in instead of pyproject.toml and poetry.lock.

The downside is that you need Poetry installed both inside and outside the Docker image in your CI build, and installing with pip isn’t quite the same as how Poetry normally installs dependencies.
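
One way to wire up the export workaround in CI is a small driver script. The following is a minimal sketch, not a definitive setup: the function names ci_commands and run, the image tag, and the dry-run default are all illustrative assumptions. It also assumes Poetry with its export plugin and Docker are available on the CI machine, and that the Dockerfile copies in requirements.txt first, as in the pip example earlier in this article.

```python
import subprocess


def ci_commands(image_tag, requirements_path="requirements.txt"):
    """Build the command lines for: export pinned deps, then build the image."""
    return [
        # Run outside the image: turn poetry.lock into a pip-compatible file.
        ["poetry", "export", "--format", "requirements.txt",
         "--output", requirements_path],
        # The Dockerfile is assumed to COPY requirements.txt before the code.
        ["docker", "build", "--tag", image_tag, "."],
    ]


def run(commands, dry_run=True):
    """Print each command; execute it too unless dry_run is set."""
    for argv in commands:
        print(" ".join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


# Dry run by default, so this only prints the commands it would execute.
run(ci_commands("myapp:latest"))
```

Since requirements.txt only changes when the lock file does, bumping the version in pyproject.toml no longer touches the file that the dependency layer depends on.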

Third, you can use poetry-dynamic-versioning, a plugin for Poetry that uses Git tags instead of pyproject.toml to set your application’s version. That way you won’t have to edit pyproject.toml to update the version.

This seems appealing until you realize you now need to copy .git into your Docker build, which has its own downsides, like larger images unless you’re using multi-stage builds.

Fourth, newer Poetry versions have a non-package mode. When package-mode = false in the configuration, Poetry is only used to manage dependencies. That means it won’t install the package itself, and as a side-effect you can just omit the version field.

Why is everything broken?

A consistent theme with Docker packaging is that nothing works quite right. Docker packaging interacts badly with everything from Unix signals—a 50-year-old technology!—to quite recent projects like Poetry.

So why is that? Partially, it’s because these technologies have their own issues. For example, the interaction of Unix signals, shells, and terminals is extremely complex to the point where I immediately forget how it works every time I attempt to (re)learn it.

But the problem with Poetry is arguably down to the way Docker’s build works: Dockerfiles are essentially glorified shell scripts, and the build system’s semantic units are files and complete command runs. There is no way in a normal Docker build to access the actually relevant semantic information: in a better build system, you’d only reinstall the changed dependencies, rather than reinstalling every dependency any time the list changes.
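
As a rough illustration of what working with that semantic information could look like, here is a Python sketch that compares two sets of pinned dependencies and works out the minimal reinstall. The function name plan_reinstall and the package pins are hypothetical, chosen only for the example. Neither Docker nor Poetry exposes anything like this to a Dockerfile.

```python
def plan_reinstall(old_pins, new_pins):
    """Compare two {package: version} mappings from successive lock files.

    Returns (install, remove, keep): packages to (re)install, packages to
    uninstall, and packages that can be left untouched.
    """
    install = {
        name: version
        for name, version in new_pins.items()
        if old_pins.get(name) != version
    }
    remove = sorted(set(old_pins) - set(new_pins))
    keep = sorted(
        name for name, version in new_pins.items()
        if old_pins.get(name) == version
    )
    return install, remove, keep


# Illustrative pins only: one package is upgraded, the rest are unchanged.
old_pins = {"flask": "1.1.2", "jinja2": "2.11.2", "click": "7.1.2"}
new_pins = {"flask": "1.1.4", "jinja2": "2.11.2", "click": "7.1.2"}

install, remove, keep = plan_reinstall(old_pins, new_pins)
print(install)  # {'flask': '1.1.4'}
print(remove)   # []
print(keep)     # ['click', 'jinja2']
```

A file-and-command cache can only say “the list changed, redo everything,” whereas a comparison like this one narrows the work down to a single package.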

Hopefully a better build system will someday replace the Docker default. Until then, it’s square pegs into round holes.
