Production-ready Docker packaging for Python developers

Table of Contents

Articles: The basics of Docker packaging

  1. Connection refused? Docker networking and how it impacts your image
    A command that runs fine on your computer may fail with connection refused when run in a container. You’ll learn why that happens, and how to prevent it.

  2. Faster or slower: the basics of Docker build caching
    Docker’s layer caching can speed up your image build—if you write your Dockerfile correctly.

  3. Where’s your code? Debugging ImportError and ModuleNotFoundErrors in your Docker image
    There are many reasons your code might fail to import in Docker. Here’s a quick series of checks you can do to figure out the problem.

  4. A tableau of crimes and misfortunes: the ever-useful docker history
    Whether it’s understanding how a base image is constructed, figuring out why your image is too big, or understanding how Dockerfile commands work, docker history should be your go-to command.

Looking for more? Learn the fundamental concepts of Docker packaging in just one afternoon, by reading my book: Just Enough Docker Packaging.

Articles: Best practices for production

The broken status quo

  1. Broken by default: why you should avoid most Dockerfile examples
    Most Dockerfile examples you’ll find on the Web are broken. And that’s a problem.

  2. A review of the official Dockerfile best practices: good, bad, and insecure
    The official Docker documentation’s Dockerfile best practices are mostly good—but they omit some important information.

Base image and dependencies

  1. The best Docker base image for your Python application (April 2020)
    Ubuntu? Official Python images? Alpine Linux? Here’s how to choose a good base image.

  2. A deep dive into the official Docker image for Python
    The official Python Docker image is useful, but to actually understand why, and to use it correctly, it’s worth understanding how exactly it’s constructed.

  3. Using Alpine can make Python Docker builds 50× slower
    Alpine Linux is often recommended as a smaller, faster base image. But if you’re using Python, it will actually do the opposite.

  4. When to switch to Python 3.9
    Python 3.9 is out now—when should you start using it?

  5. Building on solid ground: ensuring reproducible Docker builds for Python
    You want to be able to rebuild the same Docker image from the same source code across different computers and different points in time. Here’s how to get reproducible builds.

  6. Push and pull: when and why to update your dependencies
    When should you update your project’s dependencies? There are two rhythms to updates: security and critical bug fixes, and broader updates.

Security

  1. Installing system packages in Docker with minimal bloat
    Your Docker build needs to update system packages for security, and perhaps to install them for additional dependencies. Here’s how to do it without making your image too large, on Debian, Ubuntu, CentOS and RHEL.

  2. Less capabilities, more security: minimizing privilege escalation in Docker
    To reduce the security risk from your Docker image, you should run it as a non-root user. You should also reduce its capabilities: learn what, why, and how.

  3. Avoiding insecure images from Docker build caching
    Docker’s layer caching is great for speeding up builds—but you need to be careful or it’ll cause you to have insecure dependencies.

  4. Docker build secrets: the easy way, the wrong way, the sneaky way
    When you’re building Docker images you often need some secrets: a password, an SSH key. For now, Docker lacks a good mechanism to pass in secrets in a secure way, which means you need to get sneaky.

  5. Build secrets in Docker Compose, the secure way
    Build secrets are used to build your image, but if you’re not careful they will get embedded into your image, leaking them to adversaries. Learn how to use build secrets in Docker Compose without leaking your secrets.

  6. Security scanners for Python and Docker: from code to dependencies
    How do you know your code is secure? How about your Docker image? Learn how to catch problems by using security scanners running in your CI setup.

Fast builds, small images

  1. The high cost of slow Docker builds
    A slow Docker build on the critical path for developer feedback is a lot more expensive than you think.

  2. Faster Docker builds with pipenv, poetry, or pip-tools
    Installing dependencies separately from your code allows you to take advantage of Docker’s layer caching. Here’s how to do it with pipenv, poetry, or pip-tools.

  3. Elegantly activating a virtualenv in a Dockerfile
    How to activate a virtualenv in a Dockerfile without repeating yourself—plus, you’ll learn what activating a virtualenv actually does.

  4. Multi-stage builds #1: Smaller images for compiled code
    You’re building a Docker image for a Python project with compiled code (C/C++/Rust/whatever), and somehow without quite realizing it you’ve created a Docker image that is 917MB… only 1MB of which is your code!

  5. Multi-stage builds #2: Python specifics—virtualenv, --user, and other methods
    Now that you understand multi-stage builds, here’s how to implement them for Python applications.

  6. Multi-stage builds #3: Why your build is surprisingly slow, and how to speed it up
    Multi-stage builds give you small images and fast builds—in theory. In practice, they require some tricks if you want your builds to actually be fast.

Applications and runtime

  1. Configuring Gunicorn for Docker
    Running in a container isn’t the same as running on a virtual machine or physical server: you need to configure Gunicorn (and other servers) appropriately.

  2. Activating a Conda environment in your Dockerfile
    Learn how to activate a conda environment in your Dockerfile.

  3. Shrink your Conda Docker images with conda-pack
    Docker images built for Conda tend to be quite large. Learn how to shrink them by using the conda-pack tool and multi-stage builds.

  4. What’s running in production? Making your Docker images identifiable
    It’s difficult to debug production problems if you don’t know what image is running in production.

  5. Decoupling database migrations from server startup: why and how
    It’s tempting to migrate your database schema when your application container starts up. Here are some reasons to rethink that choice.

  6. A Python prompt into a running process: debugging with Manhole
    Your Python process is acting strange—wouldn’t it be useful to get a live Python interpreter prompt inside your running process?

Packaging as a process

  1. A thousand little details: developing software for ops
    Some thoughts on how to build software for ops, a domain that suffers from historical complexity and problem space complexity. And in particular, building a better way to do Docker packaging.

  2. Your Docker build needs a smoke test
    If you don’t test your Docker image before you push it, you’ll waste time (and maybe break production).

  3. Where’s that log file? Debugging failed Docker builds
    Your Docker build just failed, and the reason is buried in a log file, which is somewhere inside the build process. How do you read that log file?

  4. “Let’s use Kubernetes!” Now you have 8 problems
    For smaller teams, Kubernetes is usually the wrong solution.

Docker alternatives

  1. Docker vs. Singularity for data processing: UIDs and filesystem access
    For data processing, where you just read in some files and then write them out, containers provide reproducibility. Docker may not be the ideal container implementation for this use case, however, so we’ll also take a look at Singularity.

  2. Options for packaging your Python code: Wheels, Conda, Docker, and more
    There are a variety of ways of packaging your Python application for distribution, from wheels to Docker. This article gives a survey of the different approaches, specifically focusing on distributing internal server applications.

Products and services

Learn how Docker packaging works, in just one afternoon

New to Docker? Learn the fundamental concepts and the practical debugging techniques you need to understand Docker packaging—in just one afternoon—by reading Just Enough Docker Packaging.

Everything you need to know to Dockerize for production

Quickly learn how to make your Python application’s Docker packaging production-ready. You’ll get a step-by-step plan, and a reference covering 60+ best practices, including security, fast builds, small images, reproducibility, and much more: Python on Docker Production Quickstart.

From zero to production-ready Docker image in just 3 hours

Instead of wasting days of expensive developer time implementing and testing your own Docker packaging infrastructure, you can ship your Docker images with confidence—in just hours!—by using the Production-Ready Python Containers template.

And if you need Conda support, I’m working on a new Conda-specific template.

Remote corporate training

Make your team more productive by upgrading their skills. Whether it’s the basics of Docker packaging, or the best practices you need to run in production, consider one of my Docker packaging training courses.


Keep your job skills sharp

Join over 2300 Python developers and data scientists who get practical tools and techniques delivered to their inbox every week.