Elegantly activating a virtualenv in a Dockerfile
When you’re packaging your Python application in a Docker image, you’ll often use a virtualenv.
For example, you might be doing a multi-stage build in order to get smaller images.
Since you’re using a virtualenv, you need to activate it, but if you’re just getting started with Dockerfiles, the naive way doesn’t work.
And even if you do know how to do it, the usual method is repetitive and therefore error-prone.
There is a simpler way of activating a virtualenv, which I’ll demonstrate in this article. But first, we’ll go over some of the other, less elegant (or broken!) ways you might do it.
Note: Outside the topic under discussion, the Dockerfiles in this article are not examples of best practices, since the added complexity would obscure the main point of the article. So if you’re going to be running your Python application in production with Docker, here are two ways to apply best practices:
- If you want to DIY: A detailed checklist, with examples and references
- If you want a working setup ASAP: A template, with best practices implemented for you
The method that doesn’t work
If you just blindly convert a shell script into a Dockerfile you will get something that looks right, but is actually broken:
```dockerfile
FROM ubuntu:18.04
RUN apt-get update && apt-get install \
    -y --no-install-recommends python3 python3-virtualenv

RUN python3 -m virtualenv --python=/usr/bin/python3 /opt/venv

# This is wrong!
RUN . /opt/venv/bin/activate

# Install dependencies:
COPY requirements.txt .
RUN pip install -r requirements.txt

# Run the application:
COPY myapp.py .
CMD ["python", "myapp.py"]
```
It’s broken for two different reasons:
- Every `RUN` line in the Dockerfile is a different process. Running `activate` in a separate `RUN` has no effect on future `RUN` calls; for all practical purposes it’s a no-op.
- When you run the resulting Docker image it will run the `CMD`, which also isn’t going to be run inside the virtualenv, since it too is unaffected by the `RUN` that called `activate`.
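To make the first point concrete, here’s a rough analogy in Python rather than in Docker itself: each `run_step()` call starts a brand-new process, much as each `RUN` line does. The helper `run_step` and the variable name `DEMO_ACTIVATED` are made up for this illustration.

```python
import os
import subprocess
import sys

# Hypothetical variable name, used only for this demonstration.
VAR = "DEMO_ACTIVATED"


def run_step(code: str) -> str:
    """Run a Python snippet in a fresh process, loosely like one RUN line."""
    # Start from the parent's environment, minus the demo variable.
    env = {k: v for k, v in os.environ.items() if k != VAR}
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# Step 1 "activates" by changing its own environment, then exits.
first = run_step(
    f"import os; os.environ[{VAR!r}] = 'yes'; print(os.environ[{VAR!r}])"
)

# Step 2 is a different process, so it never sees that change.
second = run_step(f"import os; print(os.environ.get({VAR!r}, 'unset'))")

print("step 1 saw:", first)
print("step 2 saw:", second)
```

The change made in the first step dies with that process, which is what happens to `activate` when it sits in its own `RUN`.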
The repetitive method that mostly works
One solution is to explicitly use the path to the binaries in the virtualenv. In this case we only have two repetitions, but in more complex situations you’ll need to do it over and over again.
Besides the lack of readability, repetition is a source of error.
As you add more calls to Python programs, it’s easy to forget to add the magic `/opt/venv/bin/` prefix.
It will (mostly) work though:
```dockerfile
FROM ubuntu:18.04
RUN apt-get update && apt-get install \
    -y --no-install-recommends python3 python3-virtualenv

RUN python3 -m virtualenv --python=/usr/bin/python3 /opt/venv

# Install dependencies:
COPY requirements.txt .
RUN /opt/venv/bin/pip install -r requirements.txt

# Run the application:
COPY myapp.py .
CMD ["/opt/venv/bin/python", "myapp.py"]
```
The only caveat is that if any Python process launches a sub-process, that sub-process will not run in the virtualenv.
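Here’s a small Python sketch of that caveat, assuming your application starts its children with `subprocess`. A bare command name such as `python` is looked up on `PATH`, which the full-path approach never changed, whereas `sys.executable` always points at the interpreter that is currently running. The helper `child_prefix` is a hypothetical name for illustration.

```python
import shutil
import subprocess
import sys

# A bare "python" is resolved through PATH. With the full-path approach PATH
# was never modified, so this may be a system interpreter, or None if no
# command of that name exists.
print("PATH lookup finds:", shutil.which("python"))

# sys.executable is the interpreter running this very script.
print("current interpreter:", sys.executable)


def child_prefix() -> str:
    """Start a child with sys.executable and report the child's sys.prefix."""
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.prefix)"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# The child was started with the same interpreter, so it reports the same
# prefix as the parent.
print("parent prefix:", sys.prefix)
print("child prefix: ", child_prefix())
```

Using `sys.executable` only helps for Python children; any other command installed in the virtualenv still needs either its full path or a modified `PATH`.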
The repetitive method that totally works
You can fix that by actually activating the virtualenv separately for each `RUN` as well as the `CMD`:
```dockerfile
FROM ubuntu:18.04
RUN apt-get update && apt-get install \
    -y --no-install-recommends python3 python3-virtualenv

RUN python3 -m virtualenv --python=/usr/bin/python3 /opt/venv

# Install dependencies:
COPY requirements.txt .
RUN . /opt/venv/bin/activate && pip install -r requirements.txt

# Run the application:
COPY myapp.py .
CMD . /opt/venv/bin/activate && exec python myapp.py
```
(`exec` is there to get correct signal handling.)
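If you’re wondering what `exec` buys you: in a shell it replaces the shell process with the program instead of starting the program as a child, so the program itself receives signals such as the `SIGTERM` that `docker stop` sends by default. Here’s a minimal sketch of the same mechanism using Python’s `os.execv`, assuming a POSIX system. The names `CHILD` and `pids_before_and_after_exec` are hypothetical.

```python
import subprocess
import sys

# The child prints its PID, then uses os.execv to replace itself with a new
# program that prints its own PID. flush=True matters because exec discards
# any output still sitting in the old program's buffers.
CHILD = (
    "import os, sys; "
    "print(os.getpid(), flush=True); "
    "os.execv(sys.executable, "
    "[sys.executable, '-c', 'import os; print(os.getpid())'])"
)


def pids_before_and_after_exec() -> list:
    """Return the PID printed before the exec and the PID printed after it."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.split()


before, after = pids_before_and_after_exec()
print("before exec:", before)
print("after exec: ", after)
```

On POSIX the two PIDs match: no new process was created, so there is no intermediate shell left sitting between the signal and your application.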
The elegant method, in which we learn what activating actually does
It’s easy to think of `activate` as some mysterious magic, a pentacle drawn in blood to keep Python safely trapped.
But it’s just software, and fairly simple software at that.
The virtualenv documentation will even tell you that `activate` is “purely a convenience.”
If you go and read the code for `activate`, it does a number of things:
- It figures out what shell you’re running.
- It adds a `deactivate` function to your shell, and messes around with
- It changes the shell prompt to include the virtualenv name.
- It unsets the `PYTHONHOME` environment variable, if someone happened to set it.
- It sets two environment variables: `VIRTUAL_ENV` and `PATH`.
The first four are basically irrelevant to Docker usage, so that just leaves the last item.
Most of the time `VIRTUAL_ENV` has no effect, but some tools (e.g. the `poetry` packaging tool) use it to detect whether you’re running inside a virtualenv.
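As a rough idea of what such detection can look like, here’s a minimal sketch. It is not `poetry`’s actual implementation; the function names are hypothetical, and the second check relies on `sys.base_prefix` (set by Python 3’s own `venv` machinery) and on `sys.real_prefix` (set by older virtualenv releases).

```python
import os
import sys


def virtualenv_from_environ(environ=os.environ):
    """Return the path stored in VIRTUAL_ENV, or None if unset or empty."""
    return environ.get("VIRTUAL_ENV") or None


def interpreter_in_virtualenv() -> bool:
    """Check the running interpreter itself, ignoring environment variables."""
    # Inside a virtualenv, sys.prefix points at the environment while
    # sys.base_prefix points at the installation it was created from.
    return hasattr(sys, "real_prefix") or sys.prefix != sys.base_prefix


print("VIRTUAL_ENV says:", virtualenv_from_environ())
print("interpreter says:", interpreter_in_virtualenv())
```

The two answers can disagree: calling `/opt/venv/bin/python` by its full path gives you a virtualenv interpreter with no `VIRTUAL_ENV` set, which is why a tool that only looks at the variable needs it to be there.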
The most important part is setting `PATH`: `PATH` is a list of directories which are searched for commands to run.
`activate` simply adds the virtualenv’s `bin/` directory to the start of the list.
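You can watch that lookup order at work with `shutil.which`, which accepts an explicit search path. This is a sketch assuming a POSIX system; the two temporary directories stand in for a system `bin/` and a virtualenv `bin/`, and `demo-tool` and `make_fake_command` are made-up names.

```python
import os
import shutil
import tempfile


def make_fake_command(directory: str, name: str) -> str:
    """Create a tiny executable script and return its full path."""
    path = os.path.join(directory, name)
    with open(path, "w") as f:
        f.write("#!/bin/sh\necho hello\n")
    os.chmod(path, 0o755)  # mark it executable so the lookup accepts it
    return path


with tempfile.TemporaryDirectory() as system_bin, \
        tempfile.TemporaryDirectory() as venv_bin:
    system_cmd = make_fake_command(system_bin, "demo-tool")
    venv_cmd = make_fake_command(venv_bin, "demo-tool")

    # Before "activation": only the system directory is searched.
    before = shutil.which("demo-tool", path=system_bin)

    # After "activation": the virtualenv directory is first in the list,
    # so its copy of the command is found first.
    after = shutil.which("demo-tool", path=venv_bin + os.pathsep + system_bin)

print("before:", before)
print("after: ", after)
```

Both directories contain a `demo-tool`, and the one whose directory comes first wins. That is the whole trick behind activation.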
We can replace `activate` by setting the appropriate environment variables: Docker’s `ENV` instruction applies both to subsequent `RUN`s and to the `CMD`.
The result is the following Dockerfile:
```dockerfile
FROM ubuntu:18.04
RUN apt-get update && apt-get install \
    -y --no-install-recommends python3 python3-virtualenv

ENV VIRTUAL_ENV=/opt/venv
RUN python3 -m virtualenv --python=/usr/bin/python3 $VIRTUAL_ENV
ENV PATH="$VIRTUAL_ENV/bin:$PATH"

# Install dependencies:
COPY requirements.txt .
RUN pip install -r requirements.txt

# Run the application:
COPY myapp.py .
CMD ["python", "myapp.py"]
```
The virtualenv now automatically works for both `RUN` and `CMD`, without any repetition or need to remember anything.
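The same idea carries over outside Docker. Here’s a sketch of an “activation” that is nothing more than a modified copy of the environment, mirroring the two `ENV` lines plus the `PYTHONHOME` cleanup. The function name `activated_env` is hypothetical, and the `bin` subdirectory assumes a POSIX-style virtualenv layout.

```python
import os


def activated_env(venv_dir: str, environ=None) -> dict:
    """Return a copy of the environment with the virtualenv 'activated'."""
    env = dict(os.environ if environ is None else environ)

    # Equivalent of: ENV VIRTUAL_ENV=/opt/venv
    env["VIRTUAL_ENV"] = venv_dir

    # Equivalent of: ENV PATH="$VIRTUAL_ENV/bin:$PATH"
    parts = [os.path.join(venv_dir, "bin")]
    if env.get("PATH"):
        parts.append(env["PATH"])
    env["PATH"] = os.pathsep.join(parts)

    # activate also unsets PYTHONHOME, if someone happened to set it.
    env.pop("PYTHONHOME", None)
    return env


demo = activated_env("/opt/venv", {"PATH": "/usr/bin", "PYTHONHOME": "/tmp/x"})
print(demo)
```

You could pass the result as the `env=` argument of `subprocess.run` to start a child process that behaves as if the virtualenv had been activated.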
Software isn’t magic
And there you have it: a version that is as simple as our original, broken version, but actually does the right thing. No repetition, and less scope for error.
When something seems needlessly complex, dig in and figure out how it works. The software you’re using might be simpler (or more simplistic) than you think, and with a little work you might come up with a more elegant solution.