What’s running in production? Making your Docker images identifiable
You’ve just gotten a bug report from production, and you want to reproduce it on your local development machine. How do you make sure you’re running the exact same version of the production code on your local machine?
Using Docker images makes it easier, since you can download and run the same image. But that only works if you know what image is running in production.
In this article we’ll cover some of the ways you can make your Docker image identifiable and retrievable: tags, labels, and more.
Make it retrievable: storing tags on your image
Image names and their tags are not immutable; over time they can point to different images.
So for example as I’m writing this article the
python:3.7 image includes Python 3.7.4, but in the past it included Python 3.7.3, before that 3.7.2, and so on.
That means if your production environment is running a container based off the Docker image
yourapplication:latest, that can be different actual images at different times.
So the first thing to do is store tags that allow retrieving an image by its contents. Since you can have multiple tags for an image, you can in particular use the git hash, and then never overwrite that particular tag.
Let’s make a new git repository:
$ cd /tmp $ git init myrepo Initialized empty Git repository in /tmp/myrepo/.git/ $ cd myrepo
And add a
Dockerfile in there:
FROM centos ENTRYPOINT ["echo"]
And then commit it to the repository:
$ git add Dockerfile $ git commit -m "my first commit" $ git log commit 45eefe34303296b0d8ed8e37c066ec9d676e1432 (HEAD -> master) Author: Itamar Turner-Trauring <firstname.lastname@example.org> Date: Tue Sep 24 13:11:19 2019 -0400 my first commit
When we build the resulting image can give it multiple tags, including the git hash and branch:
$ GIT_COMMIT=$(git rev-parse --short HEAD) $ GIT_BRANCH=$(git rev-parse --abbrev-ref HEAD) $ echo $GIT_COMMIT $GIT_BRANCH 45eefe3 master $ docker build -t myimage:commit-$GIT_COMMIT -t myimage:branch-$GIT_BRANCH . $ docker image ls myimage REPOSITORY TAG IMAGE ID CREATED SIZE myimage branch-master f3119f0b1743 About an hour ago 202MB myimage commit-45eefe3 f3119f0b1743 About an hour ago 202MB
These tags can be pushed to the registry, so you can retrieve images by git commit or branch. The branch tag won’t be stable, but that’s OK, we’ll still have the commit tag.
Make it identifiable: via Docker
The problem with tags is that they’re not embedded into the image.
So if you deployed
yourimage:latest, you won’t know what other tags it used to have.
One solution is to embed the metadata as labels inside the image itself:
docker build -t myimage:latest --label git-commit=$GIT_COMMIT .
And you can use
docker inspect to look at the label:
$ docker inspect myimage | grep "git-commit" "LABEL git-commit=45eefe3" "git-commit": "45eefe3", "git-commit": "45eefe3",
Make it identifiable: via logs and public API
You can also use build arguments to customize the build; this allows you to pass in the git commit, store it in the image as a file, and then your application can include it in a status API endpoint, or as part of application logging on startup.
FROM centos ARG git_commit RUN echo $git_commit > /git-commit.txt
ARG is a build argument you expect the image to take.
And then we can pass it in:
$ docker build -t myimage --build-arg git_commit=$GIT_COMMIT . $ docker run -it myimage $ docker run -ti myimage [root@6d9d99b56d9b /]# cat /git-commit.txt 45eefe3
Your code can then load
git-commit.txt and put it in logs, or include as part of a
/status API endpoint.
To make it easy to identify images you’re using in production:
- Store labels and on-disk data to make images identifiable via git commit and branch.
- Push tags for each image with git commit and branch so you can retrieve images later on once you’ve identified what’s running in production.
Learn how to build fast, production-ready Docker images—read the rest of the Docker packaging guide for Python.
Production Docker packaging is too complicated to learn from Google searches
With as much as a dozen different intersecting technologies, and an unknown number of details to get right, Docker packaging isn't simple, especially for production.
But you still need fast builds that save you time, and security best practices that keep you safe.
Take the fast path to learning best practices, by using the Python on Docker Production Handbook.
Learn practical Python software engineering skills you can use at your job
Too much to learn? Don't know where to start?
Sign up for my newsletter, and join over 2600 Python developers and data scientists learning practical tools and techniques, from Docker packaging to testing to Python best practices, with a free new article in your inbox every week.