Docker can slow down your code and distort your benchmarks
One of the benefits of containers over virtual machines is that you get some measure of isolation without the performance overhead or distortion of virtualization. Docker images therefore seem like a good way to get a reproducible environment for measuring CPU performance of your code.
There are, however, complications. Sometimes, running under Docker can actually slow down your code and distort your performance measurements.
On macOS and Windows, for example, standard Linux-based Docker containers aren’t actually running directly on the OS, since the OS isn’t Linux. And the image filesystem from the container itself is typically mounted with some sort of overlay filesystem, which can slow things down, so for anything I/O bound you want to use a bind-mounted volume.
But even on Linux, with seemingly CPU-only workloads, Docker can distort runtime performance. Let’s see why, and some workarounds.
Slower in Docker… sometimes
The computer I’m testing on is running Fedora 33, and has Docker 20.10.6; I’ve disabled some operating system and CPU features that can make benchmarks less consistent (ASLR and turboboost).
I’m going to compare running some code on my machine to code inside a container, and so for maximum realism I’m going to use the
First, let’s test a tiny Rust program that just does some floating point calculations:
$ ./benchmark
Elapsed: 921ms, result: 499999999067109000
$ docker run -v $PWD:/code fedora:33 /code/benchmark
Elapsed: 915ms, result: 499999999067109000
Some of the runs were slower; I picked the fastest ones. As a first approximation it seems like the performance was the same in and out of Docker.
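Picking the fastest of several runs can be automated. The following is a minimal sketch, not the script used for the numbers in this article: `fastest_run` is a hypothetical helper name, and it measures wall-clock time for the whole process (including interpreter or container startup), which differs from the elapsed time the benchmarks print themselves.

```python
import subprocess
import sys
import time


def fastest_run(command, repeats=5):
    """Run `command` `repeats` times; return the shortest wall-clock time in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        # Discard the program's own output; we only time the run here.
        subprocess.run(command, check=True, stdout=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return min(timings)


if __name__ == "__main__":
    # Stand-in workload so the sketch runs anywhere. For a real comparison,
    # substitute e.g. ["./benchmark"] or the matching "docker run" command.
    workload = [sys.executable, "-c", "sum(range(10**6))"]
    print(f"fastest of 5 runs: {fastest_run(workload):.3f}s")
```

Taking the minimum rather than the mean is a deliberate choice: noise from other processes only ever makes a run slower, so the fastest run is usually the least disturbed one.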
Next, let’s try a Python program that again only does some computation. I’ve chosen the fastest runs:
$ python3.9 pystone.py
Pystone(1.2) time for 50000 passes = 0.248776
This machine benchmarks at 200984 pystones/second
$ docker run -v $PWD:/code fedora:33 python3.9 /code/pystone.py
Pystone(1.2) time for 50000 passes = 0.297675
This machine benchmarks at 167968 pystones/second
In this case Python throughput is about 16% lower when using Docker, which means the same work takes roughly 20% longer.
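The size of the slowdown depends on which number you compare. As a quick check, using only the figures from the two pystone runs above, the arithmetic works out like this:

```python
# Numbers copied from the two pystone runs above.
host_seconds, docker_seconds = 0.248776, 0.297675
host_rate, docker_rate = 200984, 167968  # pystones/second

# Fraction of throughput lost when running under Docker.
throughput_drop = (host_rate - docker_rate) / host_rate
# Fraction of extra time needed for the same 50000 passes.
extra_time = (docker_seconds - host_seconds) / host_seconds

print(f"throughput drop: {throughput_drop:.1%}")
print(f"extra run time:  {extra_time:.1%}")
```

Both describe the same measurement; it is worth stating which one you mean when reporting a slowdown.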
Even worse, we can see the performance hit is inconsistent: our tiny little Rust benchmark was unaffected by Docker, but the Python benchmark was slower. If the slowdown were always consistent, running everything in Docker would at least let us reliably measure relative performance, for example between two versions of some code. Inconsistent slowdowns mean Docker is distorting our results.
The cost of security
Now, containers don’t inherently have performance overhead: the whole point is that other than having different namespaces for things like networking or user IDs, a process in a container is just another process like any other.
So where’s the performance hit coming from? One plausible theory suggested by Aras Abbasi is that it’s Docker’s security features.
Docker originated in the world of platform-as-a-service, where applications from different users run exposed to the world. Docker therefore adds extra layers of security to prevent programs from escaping the container to the host.
- One of these security mechanisms is seccomp, which Docker uses to constrain what system calls containers can run.
- Older versions of seccomp have a performance problem that can slow down operations.
- Docker still hasn’t enabled this performance fix.
There may of course be other seccomp performance issues that are causing the problem, or it may be one of the other security mechanisms that Docker uses, but we can at least test this general theory by running our Docker container in privileged mode.
This disables all the security features, and so if those are responsible for the slowdown we should get our speed back:
$ docker run --privileged -v $PWD:/code fedora:33 python3.9 /code/pystone.py
Pystone(1.2) time for 50000 passes = 0.239254
This machine benchmarks at 208983 pystones/second
The code is no longer slower than the host.
And yes, I’ve run both variants many times: performance always goes back to normal when running with
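If you want to confirm whether seccomp filtering is actually active for a given process, Linux reports it in the `Seccomp:` field of `/proc/self/status`. The sketch below is an illustration rather than part of the original experiment: `seccomp_mode` is a hypothetical helper name, and the meaning of the values (0 disabled, 1 strict, 2 filter) is taken from the proc(5) man page.

```python
from pathlib import Path

# Values of the "Seccomp:" field as documented in proc(5).
SECCOMP_MODES = {0: "disabled", 1: "strict", 2: "filter"}


def seccomp_mode(status_text):
    """Return the integer seccomp mode from /proc/<pid>/status text, or None."""
    for line in status_text.splitlines():
        # "Seccomp_filters:" on newer kernels does not match this prefix.
        if line.startswith("Seccomp:"):
            return int(line.split(":", 1)[1].strip())
    return None


if __name__ == "__main__":
    status_path = Path("/proc/self/status")
    if status_path.exists():
        mode = seccomp_mode(status_path.read_text())
        print(f"seccomp mode: {mode} ({SECCOMP_MODES.get(mode, 'unknown')})")
    else:
        print("No /proc/self/status here; this check is Linux-only.")
```

Running it with the same `docker run` commands as above, once with and once without `--privileged`, should show whether the two containers differ in seccomp mode on your setup.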
Benchmarking is hard
So should you run your benchmarks with --privileged, assuming you trust the code you’re running to run as
If you’re running your code on a containerized platform, the default Docker configuration might actually match reality better. Then again, it might not. For example, the Podman reimplementation of Docker doesn’t have this problem, and Kubernetes uses a different container runtime.
Your best bet: compare measurements across different environments, work out the differences, and aim for maximum realism for your particular situation.
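As one way to compare environments side by side, here is a minimal sketch that parses the pystone output and reports each environment relative to the host. The recorded output strings are the three runs shown in this article; the helper names `pystones_per_second` and `percent_vs_baseline` are hypothetical, and for your own benchmarks you would capture the output yourself and adapt the regular expression.

```python
import re

# Output captured from the three pystone runs reported in this article.
RECORDED_OUTPUT = {
    "host": "Pystone(1.2) time for 50000 passes = 0.248776\n"
            "This machine benchmarks at 200984 pystones/second",
    "docker": "Pystone(1.2) time for 50000 passes = 0.297675\n"
              "This machine benchmarks at 167968 pystones/second",
    "docker --privileged": "Pystone(1.2) time for 50000 passes = 0.239254\n"
                           "This machine benchmarks at 208983 pystones/second",
}


def pystones_per_second(output):
    """Extract the pystones/second figure from pystone's output."""
    match = re.search(r"benchmarks at ([\d.]+) pystones/second", output)
    if match is None:
        raise ValueError("no pystones/second line found")
    return float(match.group(1))


def percent_vs_baseline(outputs, baseline="host"):
    """Map each environment to its throughput change in percent versus the baseline."""
    rates = {name: pystones_per_second(text) for name, text in outputs.items()}
    base = rates[baseline]
    return {name: (rate / base - 1) * 100 for name, rate in rates.items()}


if __name__ == "__main__":
    for name, delta in percent_vs_baseline(RECORDED_OUTPUT).items():
        print(f"{name:20s} {delta:+.1f}%")
```

A table like this makes it easier to see which environment is the outlier before deciding which one best matches where your code will really run.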