Docker can slow down your code and distort your benchmarks
One of the benefits of containers over virtual machines is that you get some measure of isolation without the performance overhead or distortion of virtualization. Docker images therefore seem like a good way to get a reproducible environment for measuring CPU performance of your code.
There are, however, complications. Sometimes, running under Docker can actually slow down your code and distort your performance measurements.
On macOS and Windows, for example, standard Linux-based Docker containers aren’t running directly on the OS, since the OS isn’t Linux; they run inside a Linux virtual machine. In addition, the container’s image filesystem is typically mounted with some sort of overlay filesystem, which can slow things down, so for anything I/O bound you want to use a bind-mounted volume.
But even on Linux, with seemingly CPU-only workloads, Docker can distort runtime performance. Let’s see why, and some workarounds.
Slower in Docker… sometimes
The computer I’m testing on is running Fedora 33, and has Docker 20.10.6; I’ve disabled some operating system and CPU features that can make benchmarks less consistent (ASLR and Turbo Boost).
I’m going to compare running some code on my machine to code inside a container, and so for maximum realism I’m going to use the
First, let’s test a tiny Rust program that just does some floating point calculations:
$ ./benchmark
Elapsed: 921ms, result: 499999999067109000
$ docker run -v $PWD:/code fedora:33 /code/benchmark
Elapsed: 915ms, result: 499999999067109000
Some runs were slower; I picked the fastest ones. As a first approximation it seems like the performance was the same in and out of Docker.
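Picking the fastest run is a common way to filter out noise from whatever else the machine happens to be doing. A minimal sketch of that approach in Python follows; `workload` is a hypothetical stand-in for the code under test (it is not the Rust benchmark above), and `best_of` is an illustrative helper name, not part of any library.

```python
import time


def workload(n=200_000):
    """Hypothetical stand-in for the code being measured: a floating point sum."""
    total = 0.0
    for i in range(n):
        total += i * 0.5
    return total


def best_of(func, repeats=5):
    """Run func several times and return the fastest wall-clock time in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        timings.append(time.perf_counter() - start)
    return min(timings)


if __name__ == "__main__":
    print(f"Fastest of 5 runs: {best_of(workload) * 1000:.1f}ms")
```

The minimum is used rather than the mean because interference from other processes can only make a run slower, never faster.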
Next, let’s try a Python program that again only does some computation. I’ve chosen the fastest runs:
$ python3.9 pystone.py
Pystone(1.2) time for 50000 passes = 0.248776
This machine benchmarks at 200984 pystones/second
$ docker run -v $PWD:/code fedora:33 python3.9 /code/pystone.py
Pystone(1.2) time for 50000 passes = 0.297675
This machine benchmarks at 167968 pystones/second
In this case Python performance is about 16% lower when using Docker, measured in pystones per second.
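The size of the slowdown depends on which number you compare. As a quick check, here is a small sketch that computes it both ways from the pystone output above; the function names are my own, chosen for illustration.

```python
def throughput_drop(host_rate, container_rate):
    """Fractional drop in throughput (e.g. pystones/second) relative to the host."""
    return (host_rate - container_rate) / host_rate


def time_increase(host_seconds, container_seconds):
    """Fractional increase in elapsed time relative to the host."""
    return (container_seconds - host_seconds) / host_seconds


if __name__ == "__main__":
    # Numbers copied from the pystone output above.
    print(f"Throughput drop: {throughput_drop(200984, 167968):.1%}")
    print(f"Time increase: {time_increase(0.248776, 0.297675):.1%}")
```

Measured as throughput, the container is roughly 16% slower; measured as elapsed time, the same run takes roughly 20% longer. Both describe the same result, so it is worth stating which one a benchmark report uses.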
Even worse, we can see the performance hit is inconsistent: our tiny little Rust benchmark was unaffected by Docker, but the Python benchmark was slower. If the slowdown were always consistent, running everything in Docker would at least let us reliably measure relative performance, for example between two versions of some code. Inconsistent slowdowns mean Docker is distorting our results.
The cost of security
Now, containers don’t inherently have performance overhead: the whole point is that other than having different namespaces for things like networking or user IDs, a process in a container is just another process like any other.
So where’s the performance hit coming from? One plausible theory suggested by Aras Abbasi is that it’s Docker’s security features.
Docker originated in the world of platform-as-a-service, where applications from different users are running exposed to the world. So Docker also adds additional layers of security to prevent programs escaping from the container to the host.
- One of these security mechanisms is seccomp, which Docker uses to constrain what system calls containers can run.
- Older versions of seccomp have a performance problem that can slow down operations.
- Docker still hasn’t enabled this performance fix.
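To see whether seccomp is actually active for a given process, Linux reports the mode in the `Seccomp:` field of `/proc/<pid>/status`. The following is a minimal sketch that parses that field; the helper name `parse_seccomp_mode` is hypothetical, and the mode meanings are taken from the proc(5) man page.

```python
from pathlib import Path

# Meaning of the "Seccomp:" field in /proc/<pid>/status, per the proc(5) man page.
SECCOMP_MODES = {0: "disabled", 1: "strict", 2: "filter"}


def parse_seccomp_mode(status_text):
    """Return the seccomp mode name from the text of a /proc/<pid>/status file."""
    for line in status_text.splitlines():
        # Match only the "Seccomp:" field, not related ones like "Seccomp_filters:".
        if line.startswith("Seccomp:"):
            value = int(line.split(":", 1)[1].strip())
            return SECCOMP_MODES.get(value, f"unknown ({value})")
    return "not reported"


if __name__ == "__main__":
    status_path = Path("/proc/self/status")  # Linux only
    if status_path.exists():
        print("seccomp mode:", parse_seccomp_mode(status_path.read_text()))
    else:
        print("No /proc/self/status here; this check only works on Linux.")
```

Run inside a container with Docker's default profile, I would expect this to report `filter`; on the host, or in a container without a seccomp profile, I would expect `disabled`. Treat that as something to verify on your own setup rather than a guarantee.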
There may of course be other seccomp performance issues that are causing the problem, or one of the other security mechanisms that Docker uses, but we can at least test this general theory by running our Docker container in privileged mode.
This disables all the security features, and so if those are responsible for the slowdown we should get our speed back:
$ docker run --privileged -v $PWD:/code fedora:33 python3.9 /code/pystone.py
Pystone(1.2) time for 50000 passes = 0.239254
This machine benchmarks at 208983 pystones/second
The code is no longer slower than the host.
And yes, I’ve run both variants many times: performance always goes back to normal when running with
More details: avoiding --privileged, and CPU impacts
The problem with --privileged is that it gives the container lots of security privileges.
So we can also try just disabling seccomp, which is a less all-encompassing security reduction.
And indeed, the following works just as well as --privileged to get back performance:
$ docker run --security-opt seccomp=unconfined -v $PWD:/code fedora:33 python3.9 /code/pystone.py
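To compare the three configurations side by side, it helps to generate the commands from one place so that only the security flags differ. Here is a sketch of that idea; the flags, image, and paths are copied from the commands in this article, while `VARIANTS` and `docker_command` are names I made up for illustration.

```python
import os
import shlex

# The three container configurations tried in this article.
VARIANTS = {
    "default": [],
    "privileged": ["--privileged"],
    "seccomp-unconfined": ["--security-opt", "seccomp=unconfined"],
}


def docker_command(extra_flags, code_dir, image="fedora:33",
                   benchmark=("python3.9", "/code/pystone.py")):
    """Build the docker run argument list for one benchmark variant."""
    return ["docker", "run", *extra_flags, "-v", f"{code_dir}:/code", image, *benchmark]


if __name__ == "__main__":
    for name, flags in VARIANTS.items():
        # Print the commands rather than running them, so the sketch works without Docker.
        print(f"{name}: {shlex.join(docker_command(flags, os.getcwd()))}")
```

Each argument list can be passed to `subprocess.run` to execute it. Running every variant several times and keeping the fastest result for each makes the comparison less sensitive to noise.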
Additional searching found this article that points the finger at security measures to prevent Spectre side-channel attacks. It therefore has some more fine-grained suggestions on how to fix this; this also implies this might be less of an issue on newer CPUs that have hardware fixes.
And in fact, the tests above were done on a Haswell Xeon (circa 2014). Testing on a 12th generation Alder Lake CPU (circa 2021) shows no slowdown when running under Docker. This may or may not be because of the CPU, however. The Haswell machine I originally tested on is no longer showing the problem (I’ve upgraded the OS and Docker since then), but I can still reproduce it on a Skylake machine from 2015 running Ubuntu 22.04.
In short, you may have different results, depending on kernel version, CPU, Docker version, and likely other factors; make sure to test.
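Since results vary with the environment, it is useful to record that environment next to each measurement. The sketch below collects the kernel version, machine type, and the kernel's reported CPU vulnerability mitigations; it assumes a Linux system that exposes `/sys/devices/system/cpu/vulnerabilities`, and returns an empty mapping otherwise. The function names are hypothetical.

```python
import platform
from pathlib import Path

# Recent Linux kernels expose per-vulnerability mitigation status in this directory.
VULN_DIR = Path("/sys/devices/system/cpu/vulnerabilities")


def read_mitigations(vuln_dir=VULN_DIR):
    """Return {vulnerability name: kernel status string}, or {} if unavailable."""
    if not vuln_dir.is_dir():
        return {}
    result = {}
    for entry in sorted(vuln_dir.iterdir()):
        try:
            result[entry.name] = entry.read_text().strip()
        except OSError:
            result[entry.name] = "unreadable"
    return result


def describe_environment():
    """Collect basic facts worth recording next to a benchmark result."""
    return {
        "kernel": platform.release(),
        "machine": platform.machine(),
        "python": platform.python_version(),
        "mitigations": read_mitigations(),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

This does not capture the Docker version or the exact CPU model, so those would need to be noted separately.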
Benchmarking is hard
So should you run your benchmarks with
If you’re running your code on a containerized platform, the default Docker configuration might actually match reality better. Then again, it might not. Different CPUs, Linux versions, container runtimes, and so on and so forth might behave differently.
Your best bet: compare measurements across different environments, work out the differences, and aim for maximum realism for your particular situation.