Articles: Production-ready Docker packaging for Python developers

Table of Contents

The basics of Docker packaging

  1. Connection refused? Docker networking and how it impacts your image
    Learn how to fix connection refused errors when trying to connect to a Docker container.

  2. Faster or slower: the basics of Docker build caching
    Docker’s layer caching can speed up your image build—if you write your Dockerfile correctly.

  3. Where’s that log file? Debugging failed Docker builds
    Your Docker build just failed, and the reason is buried in a log file somewhere inside the build process. How do you read that log file?

  4. Debugging ImportError and ModuleNotFoundErrors in your Docker image
    There are many reasons your Python code might fail to import in Docker. Here’s a quick series of checks you can do to figure out the problem.

  5. A tableau of crimes and misfortunes: the ever-useful docker history
    Use the docker history command to understand how a Docker image is constructed, why an image is too big, and how Dockerfile commands work.

Looking for more? Learn the fundamental concepts of Docker packaging in just one afternoon, by reading my book: Just Enough Docker Packaging.


Free ebook: "Introduction to Dockerizing for Production"

Learn a step-by-step iterative DevOps packaging process in this free mini-ebook. You'll learn what to prioritize, the decisions you need to make, and the ongoing organizational processes you need to start.

Plus, you'll join over 7600 people getting weekly emails covering practical tools and techniques, from Docker packaging to Python best practices.


Best practices for production

The broken status quo

  1. Broken by default: why you should avoid most Dockerfile examples
    Most Dockerfile examples for Python you’ll find on the Web are broken. And that’s a problem.

  2. Reviewing the official Dockerfile best practices: good, bad, insecure
    The official Docker documentation’s Dockerfile best practices are mostly good—but they are sometimes wrong, and if you’re using Python, too generic.

  3. The worst so-called “best practice” for Docker
    Many people (although fewer than in the past) will tell you not to install security updates in your Docker image. This is terrible advice.

Base image and dependencies

  1. The best Docker base image for your Python application (June 2023)
    Ubuntu? Official Python images? Alpine Linux? Here’s how to choose a good base Docker image for your Python application container.

  2. A deep dive into the “official” Docker image for Python
    The “official” Python Docker image is useful, but to actually understand why, and to use it correctly, it’s worth understanding how exactly it’s constructed.

  3. Why you should upgrade pip, and how to do it
    Learn the problem with using old pip, and how to upgrade pip to fix those problems.

  4. Using Alpine can make Python Docker builds 50× slower
    Alpine Linux is often recommended as a smaller, faster Docker base image. But if you’re using Python, it will slow down your build and make your image larger.

  5. When should you upgrade to Python 3.12?
    Python 3.12 has been released—when should you switch to using it?

  6. Building on solid ground: reproducible Docker builds for Python
    Learn how to get reproducible Docker builds for your Python application, including base image, system packages, and Python dependencies.

  7. It’s time to stop using Python 3.7
    Python 3.7 will stop getting security updates in July 2023. Given the existence of 3.8, 3.9, 3.10, and 3.11, you really should upgrade.

  8. CentOS 8 is dead: choosing a replacement Docker image
    CentOS 8 is no longer being maintained as a drop-in replacement for Red Hat Enterprise Linux. Here are your options for replacing it as a Docker image.

  9. Push and pull: when and why to update your dependencies
    When should you update your software project’s dependencies? There are two rhythms to updates: security and critical bug fixes, and broader updates.

  10. “Externally managed environments”: when PEP 668 breaks pip
    Getting an externally-managed-environment/PEP 668 error when you pip install? Here’s how to fix it.

Security

  1. Installing system packages in Docker with minimal bloat
    Learn how to minimize your Docker image size while installing or updating system packages on Debian, Ubuntu, and RHEL.

  2. Avoiding insecure images from Docker build caching
    Docker’s layer caching is great for speeding up builds—but you need to be careful or it’ll cause you to have insecure dependencies.

  3. Less capabilities, more security: preventing Docker escalation attacks
    Reduce the security risk from your Docker image by running as a non-root user and reducing capabilities.

  4. Staying secure by breaking Docker caching
    Even more ways you can ensure Linux distribution security updates in the face of Docker caching.

  5. How to (not) use Docker to share your password with hackers
    Docker images can leak runtime secrets, build secrets, and even just some secret files you have lying around. Learn how to leak them, and how to avoid it.

  6. Don’t leak your Docker image’s build secrets
    When you’re building Docker images you often need some secrets: a password, an SSH key. The secure mechanism is BuildKit; others might leak them.

  7. Build secrets in Docker and Compose v1, the secure way
    Build secrets like passwords may be used to build your Docker image; learn how to use them securely in Docker Compose without leaking them.

  8. Finding leaked secrets in your Docker image with a scanner
    It’s easy to mistakenly embed a secret in your Docker image. Use a scanner to find these secrets before you leak them to potential attackers.

  9. Security scanners for Python and Docker: from code to dependencies
    How do you know your Python code is secure? How about your Docker image? Learn how to catch problems using security scanners running in your CI setup.

  10. The security scanner that cried wolf
    If you’ve ever been alarmed by how many security vulnerabilities your Docker image has, even after you’ve installed security updates, here’s what’s going on.

Fast builds, small images

  1. The high cost of slow Docker builds
    A slow Docker build on the critical path for developer feedback is a lot more expensive than you think.

  2. Elegantly activating a virtualenv in a Dockerfile
    How to activate a Python virtualenv in a Dockerfile without repeating yourself—plus, you’ll learn what activating a virtualenv actually does.

  3. Faster Docker builds with pipenv, poetry, or pip-tools
    Installing Python dependencies separately from your code speeds up Docker builds. Here’s how to do it with pipenv, poetry, or pip-tools.

  4. Poetry vs. Docker caching: Fight!
    Poetry’s versioning scheme for Python dependencies makes Docker caching harder, which means slower image rebuilds. Learn some workarounds.

  5. Speed up pip downloads in Docker with BuildKit’s new caching
    Every time you change your Python pip requirements and rebuild your Docker image, you’re going to redownload all your packages. You can fix this with BuildKit.

  6. Speeding up Docker builds in CI with BuildKit
    If your CI runners spin up an empty environment, your Docker builds will be slow. Speed up builds by warming the cache, plus BuildKit’s extra speedup.

  7. Making pip installs a little less slow
    Installing packages with pip, Poetry, and Pipenv can be slow. Learn how to ensure it’s not even slower, and a potential speed-up.

  8. Shrinking your Python application’s Docker image: an overview
    Learn the variety of techniques you can use to make your Python application’s Docker image a whole lot smaller.

  9. Multi-stage builds #1: Smaller images for compiled code
    Building Docker images with compiled code can lead to huge images. Learn how to shrink them with multi-stage builds.

  10. Multi-stage builds #2: Python specifics
    Once you understand generic Docker multi-stage builds, here’s how to implement them for Python applications, with virtualenvs or user installs.

  11. Multi-stage builds #3: Speeding up your builds
    Multi-stage Docker image builds give you small images and fast builds, but only if you take extra steps to prevent slowness due to caching problems.

Conda

  1. Pip vs Conda: an in-depth comparison of Python’s two packaging systems
    Python has two packaging systems, pip and Conda. Learn the differences between them so you can pick the right one for you.

  2. Activating a Conda environment in your Dockerfile
    Learn how to activate a Conda environment in your Dockerfile.

  3. Shrink your Conda Docker images with conda-pack
    Docker images built for Conda tend to be quite large. Learn how to shrink them by using the conda-pack tool and multi-stage builds.

  4. Scanning your Conda environment for security vulnerabilities
    Learn how to check your Conda environment and packages for security vulnerabilities.

  5. Reproducible and upgradable Conda environments with conda-lock
    You want your packaging to be reproducible, and upgrade Conda dependencies without conflicts. Learn how to do it with a third-party tool: conda-lock.

  6. Speed up your Conda installs with Mamba
    Conda installs are very slow, but you can speed them up with a much faster Conda reimplementation called Mamba.

  7. Using Conda? You might not need Docker
    Conda has some overlap with Docker’s functionality; sometimes you might not need Docker at all.

Applications and runtime

  1. Configuring Gunicorn for Docker
    Running Gunicorn in a Docker container isn’t the same as running on a virtual machine or physical server. Learn what you need to do differently.

  2. What’s running in production? Making your Docker images identifiable
    It’s difficult to debug production problems if you don’t know what image is running in production.

  3. Decoupling database migrations from server startup: why and how
    Migrating your database schema when your application’s Docker container starts up? Here are some reasons to rethink that choice.

  4. A Python prompt into a running process: debugging with Manhole
    Your Python process is acting strange—learn how to get a live Python interpreter prompt inside your running process for debugging.

  5. Please stop writing shell scripts
    It is quite difficult to write correct shell scripts; you’re much better off just using Python.

  6. Why new Macs break your Docker build, and how to fix it
    New Macs can break your Docker image build in unexpected ways; learn why, and how to fix it.

Packaging as a process

  1. A thousand little details: developing software for ops
    Software for ops suffers both from historical complexity and from problem space complexity. Some generic suggestions, with Docker packaging as an example.

  2. Your Docker build needs a smoke test
    If you don’t test your Docker image before you push it, you’ll waste time (and maybe break production).

  3. “Let’s use Kubernetes!” Now you have 8 problems
    For smaller teams, Kubernetes is usually the wrong solution: too complex, too complicated, and with too much work to keep it running.

Docker variants and alternatives

  1. Docker BuildKit: faster builds, new features, and now it’s stable
    BuildKit is Docker’s new system for building images. It’s faster, has previously missing security features, and it’s finally stable.

  2. Options for Python packaging: Wheels, Conda, Docker, and more
    Learn and compare the many ways to package your Python server for distribution: wheels, PEX, RPM/DEB, Conda, executables, Docker.

  3. Docker vs. Singularity for data processing: UIDs and filesystem access
    Containers allow for reproducibility of data processing applications. Docker is the most popular option, but Singularity is also well-suited to this use case.

  4. Using Podman with BuildKit, the better Docker image builder
    Podman is a Docker replacement, and BuildKit is a new builder for Docker images. Learn how to use BuildKit together with Podman.

  5. Building Docker images on GitLab CI: Docker-in-Docker and Podman
    Building Docker images with GitLab CI can be a little complicated. Learn how to do it with Docker-in-Docker, or the simpler option of using Podman.

