Elegantly activating a virtualenv in a Dockerfile
When you’re packaging your Python application in a Docker image, you’ll often use a
For example, you might be doing a multi-stage build in order to get smaller images.
Since you’re using a
virtualenv, you need to activate it—but if you’re just getting started with Dockerfiles, the naive way doesn’t work.
And even if you do know how to do it, the usual method is repetitive and therefore error-prone.
There is a simpler way of activating a virtualenv, which I’ll demonstrate in this article. But first, we’ll go over some of the other, less elegant (or broken!) ways you might do it.
Note: Outside any specific best practice being demonstrated, the Dockerfiles in this article are not examples of best practices, since the added complexity would obscure the main point of the article.
Make sure your production software is packaged securely, efficiently, and quickly: Read the pragmatic, thorough, and concise Python on Docker Production Handbook.
The method that doesn’t work
If you just blindly convert a shell script into a Dockerfile you will get something that looks right, but is actually broken:
FROM python:3.9-slim-bullseye RUN python3 -m venv /opt/venv # This is wrong! RUN . /opt/venv/bin/activate # Install dependencies: COPY requirements.txt . RUN pip install -r requirements.txt # Run the application: COPY myapp.py . CMD ["python", "myapp.py"]
It’s broken for two different reasons:
RUNline in the Dockerfile is a different process. Running
activatein a separate
RUNhas no effect on future
RUNcalls; for all practical purposes it’s a no-op.
- When you run the resulting Docker image it will run the
CMD—which also isn’t going to be run inside the virtualenv, since it too is unaffected by the
The repetitive method that mostly works
One solution is to explicitly use the path to the binaries in the virtualenv. In this case we only have two repetitions, but in more complex situations you’ll need to do it over and over again.
Besides the lack of readability, repetition is a source of error.
As you add more calls to Python programs, it’s easy to forget to add the magic
It will (mostly) work though:
FROM python:3.9-slim-bullseye RUN python3 -m venv /opt/venv # Install dependencies: COPY requirements.txt . RUN /opt/venv/bin/pip install -r requirements.txt # Run the application: COPY myapp.py . CMD ["/opt/venv/bin/python", "myapp.py"]
The only caveat is that if any Python process launches a sub-process, that sub-process will not run in the virtualenv.
The repetitive method that totally works
You can fix that by actually activating the virtualenv separately for each
RUN as well as the
FROM python:3.9-slim-bullseye RUN python3 -m venv /opt/venv # Install dependencies: COPY requirements.txt . RUN . /opt/venv/bin/activate && pip install -r requirements.txt # Run the application: COPY myapp.py . CMD . /opt/venv/bin/activate && exec python myapp.py
exec is there to get correct signal handling.)
The elegant method, in which we learn what activating actually does
It’s easy to think of
activate as some mysterious magic, a pentacle drawn in blood to keep Python safely trapped.
But it’s just software, and fairly simple software at that.
The virtualenv documentation will even tell you that
activate is “purely a convenience.”
If you go and read the code for
activate, it does a number of things:
- It figures out what shell you’re running.
- It adds a
deactivatefunction to your shell, and messes around with
- It changes the shell prompt to include the virtualenv name.
- It unsets the
PYTHONHOMEenvironment variable, if someone happened to set it.
- It sets two environment variables:
The first four are basically irrelevant to Docker usage, so that just leaves the last item.
Most of the time
VIRTUAL_ENV has no effect, but some tools—e.g. the
poetry packaging tool—use it to detect whether you’re running inside a virtualenv.
The most important part is setting
PATH is a list of directories which are searched for commands to run.
activate simply adds the virtualenv’s
bin/ directory to the start of the list.
We can replace
activate by setting the appropriate environment variables: Docker’s
ENV command applies both subsequent
RUNs as well as to the
The result is the following Dockerfile:
FROM python:3.9-slim-bullseye ENV VIRTUAL_ENV=/opt/venv RUN python3 -m venv $VIRTUAL_ENV ENV PATH="$VIRTUAL_ENV/bin:$PATH" # Install dependencies: COPY requirements.txt . RUN pip install -r requirements.txt # Run the application: COPY myapp.py . CMD ["python", "myapp.py"]
The virtualenv now automatically works for both
CMD, without any repetition or need to remember anything.
Software isn’t magic
And there you have it: a version that is as simple as our original, broken version, but actually does the right thing. No repetition, and less scope for error.
When something seems needlessly complex, dig in and figures out how it works. The software you’re using might be simpler (or more simplistic) than you think, and with a little work you might come up with a more elegant solution.
The concise and pragmatic guide to Docker packaging for production
Docker packaging for production is complicated, with as many as 70+ best practices to get right. And you want small images, fast builds, and your Python application running securely.
Take the fast path to learning best practices, by using the Python on Docker Production Handbook.
Free ebook: Introduction to Dockerizing for Production
Learn a step-by-step iterative DevOps packaging process in this free mini-ebook. You'll learn what to prioritize, the decisions you need to make, and the ongoing organizational processes you need to start.
Plus, you'll join my newsletter and get weekly articles covering practical tools and techniques, from Docker packaging to Python best practices.