What’s running in production? Making your Docker images identifiable
You’ve just gotten a bug report from production, and you want to reproduce it on your local development machine. How do you make sure you’re running the exact same version of the production code on your local machine?
Using Docker images makes it easier, since you can download and run the same image. But that only works if you know what image is running in production.
In this article we’ll cover some of the ways you can make your Docker image identifiable and retrievable: tags, labels, and more.
Make it retrievable: storing tags on your image
Image names and their tags are not immutable; over time they can point to different images.
So for example, as I’m writing this article the python:3.7 image includes Python 3.7.4, but in the past it included Python 3.7.3, before that 3.7.2, and so on.
That means if your production environment is running a container based on the Docker image yourapplication:latest, that name can refer to different actual images at different times.
So the first thing to do is store tags that allow retrieving an image by its contents. Since you can have multiple tags for an image, you can in particular use the git hash, and then never overwrite that particular tag.
Let’s make a new git repository:
    $ cd /tmp
    $ git init myrepo
    Initialized empty Git repository in /tmp/myrepo/.git/
    $ cd myrepo
And add a Dockerfile in there:
    FROM centos
    ENTRYPOINT ["echo"]
And then commit it to the repository:
    $ git add Dockerfile
    $ git commit -m "my first commit"
    $ git log
    commit 45eefe34303296b0d8ed8e37c066ec9d676e1432 (HEAD -> master)
    Author: Itamar Turner-Trauring <email@example.com>
    Date:   Tue Sep 24 13:11:19 2019 -0400

        my first commit
When we build the image, we can give it multiple tags, including the git hash and branch:
    $ GIT_COMMIT=$(git rev-parse --short HEAD)
    $ GIT_BRANCH=$(git rev-parse --abbrev-ref HEAD)
    $ echo $GIT_COMMIT $GIT_BRANCH
    45eefe3 master
    $ docker build -t myimage:commit-$GIT_COMMIT -t myimage:branch-$GIT_BRANCH .
    $ docker image ls myimage
    REPOSITORY   TAG              IMAGE ID       CREATED             SIZE
    myimage      branch-master    f3119f0b1743   About an hour ago   202MB
    myimage      commit-45eefe3   f3119f0b1743   About an hour ago   202MB
These tags can be pushed to the registry, so you can retrieve images by git commit or branch. The branch tag won’t be stable, since it moves whenever the branch gets a new commit, but that’s OK: we’ll still have the commit tag.
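One wrinkle is that not every branch name is a valid Docker tag: a branch such as feature/login contains a slash, which tags don't allow. Here is a minimal sketch of deriving the tag names, assuming your build is driven from a Python script. The function names and the commit-/branch- prefixes simply mirror the example above and are otherwise hypothetical; the sanitizing rule follows Docker's tag format (letters, digits, underscores, periods, and hyphens, at most 128 characters, not starting with a period or hyphen).

```python
import re
import subprocess


def sanitize_tag(name: str) -> str:
    """Turn an arbitrary string (e.g. a branch name) into a valid Docker tag."""
    # Replace every character a tag may not contain with a hyphen.
    tag = re.sub(r"[^A-Za-z0-9_.-]", "-", name)
    # A tag may not start with a period or a hyphen.
    tag = tag.lstrip(".-")
    # Tags are limited to 128 characters; fall back if nothing is left.
    return tag[:128] or "unknown"


def image_tags(image: str, commit: str, branch: str) -> list[str]:
    """Return the image references to build and push for one commit."""
    return [
        f"{image}:{sanitize_tag('commit-' + commit)}",
        f"{image}:{sanitize_tag('branch-' + branch)}",
    ]


def git(*args: str) -> str:
    """Run a git command in the current directory and return its output."""
    result = subprocess.run(
        ["git", *args], check=True, capture_output=True, text=True
    )
    return result.stdout.strip()


def tags_for_checkout(image: str) -> list[str]:
    """Compute the tags for whatever commit is currently checked out."""
    commit = git("rev-parse", "--short", "HEAD")
    branch = git("rev-parse", "--abbrev-ref", "HEAD")
    return image_tags(image, commit, branch)
```

Calling tags_for_checkout("myimage") in the repository above would give the same two tags as the shell commands. Note that on a detached HEAD, which is common in CI systems, git rev-parse --abbrev-ref HEAD prints HEAD rather than a branch name, so you may need to get the branch from your CI environment instead.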
Make it identifiable: via Docker
The problem with tags is that they’re not embedded into the image.
So if you deployed yourimage:latest, you won’t know what other tags it used to have.
One solution is to embed the metadata as labels inside the image itself:
    $ docker build -t myimage:latest --label git-commit=$GIT_COMMIT .
And you can use docker inspect to look at the label:
    $ docker inspect myimage | grep "git-commit"
    "LABEL git-commit=45eefe3"
    "git-commit": "45eefe3",
    "git-commit": "45eefe3",
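Grepping works, but docker inspect prints JSON, so the label can also be read precisely. The sketch below assumes the label sits under Config.Labels in the inspect output for an image, which is where Docker records image labels; the function names are hypothetical.

```python
import json
import subprocess
from typing import Optional


def label_from_inspect(inspect_json: str, label: str) -> Optional[str]:
    """Extract one label from the JSON printed by `docker inspect IMAGE`."""
    # `docker inspect` prints a JSON array with one object per inspected item.
    objects = json.loads(inspect_json)
    if not objects:
        return None
    # Config or Labels can be null, e.g. when the image has no labels at all.
    config = objects[0].get("Config") or {}
    labels = config.get("Labels") or {}
    return labels.get(label)


def image_label(image: str, label: str) -> Optional[str]:
    """Run `docker inspect` on a local image and return the given label."""
    result = subprocess.run(
        ["docker", "inspect", image], check=True, capture_output=True, text=True
    )
    return label_from_inspect(result.stdout, label)
```

With the image built above, image_label("myimage", "git-commit") should return the short commit hash. If you only need the value on the command line, docker inspect --format '{{ index .Config.Labels "git-commit" }}' myimage does the same job without any parsing.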
Make it identifiable: via logs and public API
You can also use build arguments to customize the build: pass in the git commit and store it in the image as a file. Your application can then report it from a status API endpoint, or log it on startup.
    FROM centos
    ARG git_commit
    RUN echo $git_commit > /git-commit.txt
ARG declares a build argument that the build expects to be given.
And then we can pass it in:
    $ docker build -t myimage --build-arg git_commit=$GIT_COMMIT .
    $ docker run -it myimage
    [root@6d9d99b56d9b /]# cat /git-commit.txt
    45eefe3
Your code can then load /git-commit.txt and put it in logs, or include it as part of a /status API endpoint.
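As a minimal sketch of that last step, assuming a Python application and the /git-commit.txt path written by the Dockerfile above: the function names and the shape of the status payload are illustrative and not tied to any particular web framework.

```python
import logging
from pathlib import Path

logger = logging.getLogger(__name__)

# Path written by the RUN step in the Dockerfile above.
GIT_COMMIT_FILE = Path("/git-commit.txt")


def read_git_commit(path: Path = GIT_COMMIT_FILE) -> str:
    """Return the commit baked into the image, or "unknown" if unavailable."""
    try:
        commit = path.read_text().strip()
    except OSError:
        # The file is missing, e.g. when running outside the container.
        return "unknown"
    # The file holds only a newline if the build argument was not passed.
    return commit or "unknown"


def log_version(path: Path = GIT_COMMIT_FILE) -> None:
    """Log the running commit; intended to be called once at startup."""
    logger.info("Starting application, git commit %s", read_git_commit(path))


def status_payload(path: Path = GIT_COMMIT_FILE) -> dict:
    """Build a dict that a /status endpoint could return as JSON."""
    return {"status": "ok", "git_commit": read_git_commit(path)}
```

Falling back to "unknown" instead of raising means the same code still starts on a development machine, where the file typically does not exist.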
To make it easy to identify images you’re using in production:
- Store labels and on-disk data to make images identifiable via git commit and branch.
- Push tags for each image with git commit and branch so you can retrieve images later on once you’ve identified what’s running in production.