Elegantly activating a virtualenv in a Dockerfile

by Itamar Turner-Trauring
Last updated 13 Sep 2024, originally created 20 Mar 2019

When you’re packaging your Python application in a Docker image, you’ll often use a virtualenv. For example, you might be doing a multi-stage build in order to get smaller images.

Since you’re using a virtualenv, you need to activate it—but if you’re just getting started with Dockerfiles, the naive way doesn’t work. And even if you do know how to do it, the usual method is repetitive and therefore error-prone.

There is a simpler way of activating a virtualenv, which I’ll demonstrate in this article. But first, we’ll go over some of the other, less elegant (or broken!) ways you might do it.

Note: Outside any specific best practice being demonstrated, the Dockerfiles in this article are not examples of best practices, since the added complexity would obscure the main point of the article.

The method that doesn’t work

If you blindly convert a shell script into a Dockerfile, you will get something that looks right but is actually broken:

FROM python:3.12-slim
RUN python3 -m venv /opt/venv

# This is wrong!
RUN . /opt/venv/bin/activate

# Install dependencies:
COPY requirements.txt .
RUN pip install -r requirements.txt

# Run the application:
COPY myapp.py .
CMD ["python", "myapp.py"]

It’s broken for two different reasons:

1. Every RUN line in the Dockerfile is a different process. Running activate in a separate RUN has no effect on future RUN calls; for all practical purposes it’s a no-op.
2. When you run the resulting Docker image it will run the CMD, which also isn’t going to be run inside the virtualenv, since it too is unaffected by the RUN processes.
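
To see why the first point holds, here is a minimal sketch in Python that mimics two consecutive RUN lines by starting two separate shell processes. It assumes a POSIX sh is available, and DEMO_VENV_VAR is a made-up variable name used only for this illustration; a real Docker build involves more than separate processes, but the effect on environment variables is the same.

```python
import os
import subprocess

# Start from the current environment, minus the demo variable, so the result
# does not depend on whatever happens to be set already.
# DEMO_VENV_VAR is a hypothetical name used only for this illustration.
base_env = {k: v for k, v in os.environ.items() if k != "DEMO_VENV_VAR"}


def run_shell(command):
    """Run a command in a brand-new sh process and return its stdout."""
    result = subprocess.run(
        ["sh", "-c", command],
        env=base_env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# First "RUN": the variable is set and visible inside this one shell.
first = run_shell("export DEMO_VENV_VAR=activated; echo $DEMO_VENV_VAR")

# Second "RUN": a fresh shell, so the earlier export is gone.
second = run_shell("echo ${DEMO_VENV_VAR:-unset}")

print(first)
print(second)
```

The first shell sees the variable it exported; the second one does not, which is exactly what happens to the changes made by activate once its RUN line finishes.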

The repetitive method that mostly works

One solution is to explicitly use the path to the binaries in the virtualenv. In this case we only have two repetitions, but in more complex situations you’ll need to do it over and over again.

Besides the lack of readability, repetition is a source of error. As you add more calls to Python programs, it’s easy to forget to add the magic /opt/venv/bin/ prefix.

It will (mostly) work though:

FROM python:3.12-slim

RUN python3 -m venv /opt/venv

# Install dependencies:
COPY requirements.txt .
RUN /opt/venv/bin/pip install -r requirements.txt

# Run the application:
COPY myapp.py .
CMD ["/opt/venv/bin/python", "myapp.py"]

The only caveat is that if any Python process launches a sub-process by command name, for example by running python, the lookup goes through the unchanged PATH, so that sub-process will not run in the virtualenv.
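
If you do stick with explicit paths, one way to soften this caveat for Python sub-processes is to launch them with sys.executable rather than a bare command name. The following is a sketch of that idea, not something the Dockerfile above does for you, and it only helps for sub-processes your own code starts.

```python
import subprocess
import sys

# sys.executable is the absolute path of the interpreter running this script.
# Passing it explicitly avoids a PATH lookup for the name "python", so the
# child uses the same interpreter (and the same virtualenv) as the parent.
result = subprocess.run(
    [sys.executable, "-c", "import sys; print(sys.prefix)"],
    capture_output=True,
    text=True,
    check=True,
)
child_prefix = result.stdout.strip()

print("parent prefix:", sys.prefix)
print("child prefix: ", child_prefix)
```

Command-line tools installed into the virtualenv still need either their full path or a corrected PATH.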

The repetitive method that totally works

You can fix that by actually activating the virtualenv separately for each RUN as well as the CMD:

FROM python:3.12-slim

RUN python3 -m venv /opt/venv

# Install dependencies:
COPY requirements.txt .
RUN . /opt/venv/bin/activate && pip install -r requirements.txt

# Run the application:
COPY myapp.py .
CMD . /opt/venv/bin/activate && exec python myapp.py

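(The exec is there to get correct signal handling: it replaces the shell with the Python process, so signals such as the SIGTERM sent by docker stop reach Python directly.)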
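
As a rough illustration of what exec changes, the sketch below starts a shell that prints its own process ID and then launches a second shell, once with exec and once without. It assumes a POSIX sh; the inner shell stands in for the Python application.

```python
import subprocess


def pids(command):
    """Run a command under sh -c and return the PIDs it prints."""
    result = subprocess.run(
        ["sh", "-c", command],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.split()


# With exec, the inner program replaces the outer shell and keeps its PID.
with_exec = pids("echo $$; exec sh -c 'echo $$'")

# Without exec, the inner program is a child process with a different PID.
# The trailing "true" keeps the shell from quietly optimizing this into an exec.
without_exec = pids("echo $$; sh -c 'echo $$'; true")

print("with exec:   ", with_exec)
print("without exec:", without_exec)
```

With exec, both printed PIDs are the same, so there is no intermediate shell left sitting between Docker and your application.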

The elegant method, in which we learn what activating actually does

It’s easy to think of activate as some mysterious magic, a pentacle drawn in blood to keep Python safely trapped. But it’s just software, and fairly simple software at that. The virtualenv documentation will even tell you that activate is “purely a convenience.”

If you go and read the code for activate, it does a number of things:

1. It figures out what shell you’re running.
2. It adds a deactivate function to your shell, and messes around with pydoc.
3. It changes the shell prompt to include the virtualenv name.
4. It unsets the PYTHONHOME environment variable, if someone happened to set it.
5. It sets two environment variables: VIRTUAL_ENV and PATH.

The first four are basically irrelevant to Docker usage, so that just leaves the last item. Most of the time VIRTUAL_ENV has no effect, but some tools—e.g. the poetry packaging tool—use it to detect whether you’re running inside a virtualenv.

The most important part is setting PATH: PATH is a list of directories which are searched for commands to run. activate simply adds the virtualenv’s bin/ directory to the start of the list.
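
The effect of putting a directory at the start of PATH can be sketched in a few lines of Python. The example builds a throwaway directory that stands in for a virtualenv's bin/ directory; demo-venv-tool is a made-up command name, and the file permissions assume a POSIX system.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as venv_dir:
    # Stand-in for /opt/venv/bin, containing one executable script.
    bin_dir = os.path.join(venv_dir, "bin")
    os.makedirs(bin_dir)
    tool = os.path.join(bin_dir, "demo-venv-tool")  # hypothetical command
    with open(tool, "w") as f:
        f.write("#!/bin/sh\necho hello\n")
    os.chmod(tool, 0o755)

    original_path = os.environ.get("PATH", "")
    # This is the essential step activate performs: prepend bin/ to PATH.
    activated_path = bin_dir + os.pathsep + original_path

    # shutil.which does the same kind of lookup a shell does for a command.
    before = shutil.which("demo-venv-tool", path=original_path)
    after = shutil.which("demo-venv-tool", path=activated_path)

print("before:", before)
print("after: ", after)
```

Before the change the command cannot be found; afterwards it resolves to the copy inside the stand-in bin/ directory, and because that directory comes first it would also win over any same-named command later in PATH.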

We can replace activate by setting the appropriate environment variables: Docker’s ENV instruction applies both to subsequent RUN instructions and to the CMD.

The result is the following Dockerfile:

FROM python:3.12-slim

ENV VIRTUAL_ENV=/opt/venv
RUN python3 -m venv $VIRTUAL_ENV
ENV PATH="$VIRTUAL_ENV/bin:$PATH"

# Install dependencies:
COPY requirements.txt .
RUN pip install -r requirements.txt

# Run the application:
COPY myapp.py .
CMD ["python", "myapp.py"]

The virtualenv now automatically works for both RUN and CMD, without any repetition or need to remember anything.
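
If you want to confirm this for yourself, a small check like the following could be run as a script inside the container. It is only a sketch: it relies on sys.prefix differing from sys.base_prefix, which is how environments created with python -m venv identify themselves, and the function name in_virtualenv is my own.

```python
import os
import sys


def in_virtualenv():
    """Return True if this interpreter is running inside a virtualenv."""
    # Inside a venv, sys.prefix points at the venv directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("interpreter:  ", sys.executable)
print("in virtualenv:", in_virtualenv())
print("VIRTUAL_ENV:  ", os.environ.get("VIRTUAL_ENV", "(not set)"))
```

With the Dockerfile above you would expect the interpreter to live under /opt/venv/bin and VIRTUAL_ENV to be /opt/venv.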

Software isn’t magic

And there you have it: a version that is as simple as our original, broken version, but actually does the right thing. No repetition, and less scope for error.

When something seems needlessly complex, dig in and figure out how it works. The software you’re using might be simpler (or more simplistic) than you think, and with a little work you might come up with a more elegant solution.
