What’s running in production? Making your Docker images identifiable

by Itamar Turner-Trauring
Last updated 13 Sep 2024, originally created 30 Sep 2019

You’ve just gotten a bug report from production, and you want to reproduce it on your local development machine. How do you make sure you’re running the exact same version of the production code on your local machine?

Using Docker images makes it easier, since you can download and run the same image. But that only works if you know what image is running in production.

In this article we’ll cover some of the ways you can make your Docker image identifiable and retrievable: tags, labels, and more.

Make it retrievable: storing tags on your image

Image names and their tags are not immutable; over time they can point to different images. For example, as I’m writing this article the python:3.12 image includes Python 3.12.6, but in the past it included Python 3.12.5, before that 3.12.4, and so on.

That means if your production environment is running a container based on the Docker image yourapplication:latest, that name can refer to different actual images at different times.

So the first thing to do is store tags that let you retrieve an image based on the exact code it contains. Since an image can have multiple tags, you can in particular use the git commit hash as a tag, and then never overwrite that particular tag.

Let’s make a new git repository:

$ cd /tmp
$ git init myrepo
Initialized empty Git repository in /tmp/myrepo/.git/
$ cd myrepo

And add a Dockerfile in there:

FROM debian:12
ENTRYPOINT ["echo"]

And then commit it to the repository:

$ git add Dockerfile 
$ git commit -m "my first commit"
$ git log
commit 45eefe34303296b0d8ed8e37c066ec9d676e1432 (HEAD -> master)
Author: Itamar Turner-Trauring <itamar@itamarst.org>
Date:   Tue Sep 24 13:11:19 2019 -0400

    my first commit

When we build the image we can give it multiple tags, including the git hash and branch:

$ GIT_COMMIT=$(git rev-parse --short HEAD)
$ GIT_BRANCH=$(git rev-parse --abbrev-ref HEAD)
$ echo $GIT_COMMIT $GIT_BRANCH
45eefe3 master
$ docker build -t myimage:commit-$GIT_COMMIT -t myimage:branch-$GIT_BRANCH .
$ docker image ls myimage
REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
myimage             branch-master       f3119f0b1743        About an hour ago   202MB
myimage             commit-45eefe3      f3119f0b1743        About an hour ago   202MB

These tags can be pushed to the registry, so you can retrieve images by git commit or branch. The branch tag won’t be stable, since it moves to a new image every time the branch is rebuilt, but that’s OK: we’ll still have the commit tag.
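
The build-and-push step could be scripted roughly as follows. This is a minimal sketch, not a definitive implementation: the registry path registry.example.com/myimage and the helper names sanitize_tag and build_and_push are hypothetical. The sanitizing step is there because branch names such as feature/login contain characters that are not allowed in Docker tags, which may only contain letters, digits, underscores, periods, and dashes.

```shell
#!/bin/sh
# Sketch only: IMAGE is a hypothetical registry path, not from the article.
IMAGE="registry.example.com/myimage"

# Replace every character that is not valid in a Docker tag with "-".
# Valid tag characters are letters, digits, "_", "." and "-".
# (Tag length limits are not handled here.)
sanitize_tag() {
    printf '%s' "$1" | tr -c 'A-Za-z0-9_.-' '-'
}

# Build the image with a commit tag and a branch tag, then push both.
# Stops at the first command that fails.
build_and_push() {
    git_commit=$(git rev-parse --short HEAD) || return 1
    git_branch=$(sanitize_tag "$(git rev-parse --abbrev-ref HEAD)")
    docker build \
        -t "$IMAGE:commit-$git_commit" \
        -t "$IMAGE:branch-$git_branch" . || return 1
    docker push "$IMAGE:commit-$git_commit" || return 1
    docker push "$IMAGE:branch-$git_branch"
}
```

Calling build_and_push from the root of the repository would build once and push both tags; the commit tag is the one to treat as write-once.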

Make it identifiable: via Docker

The problem with tags is that they’re not embedded into the image. So if you deployed yourimage:latest, you won’t know what other tags it used to have.

One solution is to embed the metadata as labels inside the image itself:

$ docker build -t myimage:latest --label git-commit=$GIT_COMMIT .

And you can use docker inspect to look at the label:

$ docker inspect myimage | grep "git-commit"
                "LABEL git-commit=45eefe3"
                "git-commit": "45eefe3",
                "git-commit": "45eefe3",
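
Instead of grepping the full JSON, the label can also be read directly with the --format option of docker inspect, which takes a Go template. The sketch below wraps that in a small function; the function name image_git_commit is hypothetical, and it assumes the image was built with the git-commit label shown above.

```shell
#!/bin/sh
# Print the value of the "git-commit" label for the given image.
# Prints an empty line if the image has no such label.
image_git_commit() {
    docker inspect --format '{{ index .Config.Labels "git-commit" }}' "$1"
}
```

For the image built above, image_git_commit myimage:latest should print the short commit hash, which can then be used to pull the matching commit- tag from the registry.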

Make it identifiable: via logs and public API

You can also use build arguments to customize the build. This allows you to pass in the git commit and store it in the image as a file; your application can then include it in a status API endpoint, or log it on startup.

FROM debian:12
ARG git_commit
RUN echo $git_commit > /git-commit.txt

The ARG instruction declares a build argument that the Dockerfile expects to be passed in at build time.

And then we can pass it in:

$ docker build -t myimage --build-arg git_commit=$GIT_COMMIT .
$ docker run -it myimage
[root@6d9d99b56d9b /]# cat /git-commit.txt 
45eefe3

Your code can then load git-commit.txt and put it in logs, or include it as part of a /status API endpoint.
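
As one possible illustration, a container startup script could read the file and log the commit before launching the application. This is a rough sketch under assumptions: the function names read_git_commit and log_startup are hypothetical, and falling back to "unknown" when the file is missing or empty (for example, if the build argument was not passed) is a design choice, not something the Dockerfile above requires.

```shell
#!/bin/sh
# Print the commit stored in the given file (default: /git-commit.txt),
# or "unknown" if the file is missing, unreadable, or empty.
read_git_commit() {
    commit_file="${1:-/git-commit.txt}"
    commit=""
    if [ -r "$commit_file" ]; then
        # Strip the trailing newline and any other whitespace.
        commit=$(tr -d '[:space:]' < "$commit_file")
    fi
    printf '%s\n' "${commit:-unknown}"
}

# Write a startup log line to stderr that includes the commit.
log_startup() {
    echo "Starting application, git commit: $(read_git_commit "$1")" >&2
}
```

Calling log_startup at the top of an entrypoint script puts the commit in the container logs on every start, so it shows up wherever those logs are collected.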

Summary

To make it easy to identify images you’re using in production:

Store labels and on-disk data to make images identifiable via git commit and branch.
Push tags for each image with git commit and branch so you can retrieve images later on once you’ve identified what’s running in production.
