Speed up your Conda installs with Mamba

Conda installs can be very very very slow. Every time you run conda install:

  1. It has to collect the package metadata.
  2. It has to solve the environment. … maybe you can take a coffee break here, or go work on a jigsaw puzzle to relax …
  3. It has to download packages.
  4. Eventually, finally, it will install the packages it downloaded.

By the time this is all done you’ve probably forgotten what it was you were trying to do in the first place. To be fair, Conda has gotten faster in the past few releases, but it’s still far from being fast.

Luckily, a new project called Mamba has set out to reimplement Conda functionality while running much faster. So let’s see:

  • How much faster Mamba is.
  • How to switch to Mamba.
  • Using it in Docker to make image builds even faster.

Measuring Conda vs. Mamba install speed

Let’s compare installing the following environment.yml with both Conda and Mamba:

name: myenv
channels:
  - conda-forge
dependencies:
  - python=3.9
  - matplotlib
  - pandas
  - scipy

In order to make sure there’s no caching messing with the results, we’ll do this as part of a Docker build. First, a normal Conda install:

# Dockerfile.just-conda
FROM continuumio/miniconda3
COPY environment.yml .
RUN /bin/bash -c "time conda env create -f environment.yml"

Running this, I got the following output from the time utility for running the conda env create:

real    2m15.143s
user    1m38.642s
sys     0m6.421s

If you’re not familiar with the output of time, it’s saying that 2 minutes 15 seconds elapsed on the wall clock, of which about 1 minute 45 seconds was CPU time (user plus sys). To learn more see my article on CPU vs. clock time.
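The arithmetic behind that reading can be sketched in a few lines of shell. This is only an illustration: the helper name to_seconds is made up for this example, and the numbers are the ones from the output above.

```shell
# Convert a bash `time` value such as "1m38.642s" into seconds.
# (to_seconds is a hypothetical helper, not part of any tool.)
to_seconds() {
    echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

real=$(to_seconds 2m15.143s)
user=$(to_seconds 1m38.642s)
sys=$(to_seconds 0m6.421s)

# CPU time is user + sys; the rest of the wall-clock time was spent
# waiting on something else, such as the network or the disk.
cpu=$(echo "$user $sys" | awk '{ printf "%.3f", $1 + $2 }')
waiting=$(echo "$real $cpu" | awk '{ printf "%.3f", $1 - $2 }')

echo "cpu=${cpu}s waiting=${waiting}s"
```

So roughly half a minute of the Conda run was not spent computing at all.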

Next, we’ll install Mamba (conda install -c conda-forge mamba) and time creating the environment using Mamba:

# Dockerfile.conda-then-mamba
FROM continuumio/miniconda3
COPY environment.yml .
RUN conda install -c conda-forge mamba
RUN /bin/bash -c "time mamba env create -f environment.yml"

Here’s the result:

real    0m47.090s
user    0m28.493s
sys     0m4.692s

Mamba installs these packages in about a third of the time that Conda does. Much of that is due to less CPU usage, but even network downloads seem to be a little faster; Mamba uses parallel downloads to speed them up.
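To see where the savings come from, here is a rough sketch that compares the two sets of timings. The values are copied from the two time outputs above and converted to seconds; the variable names are just for this example.

```shell
# Timings from the two runs above, in seconds: real, user, sys.
conda_times="135.143 98.642 6.421"
mamba_times="47.090 28.493 4.692"

summary=$(echo "$conda_times $mamba_times" | awk '{
    conda_cpu = $2 + $3   # Conda CPU time (user + sys)
    mamba_cpu = $5 + $6   # Mamba CPU time (user + sys)
    printf "wall-clock speedup: %.2fx\n", $1 / $4
    printf "CPU time speedup: %.2fx\n", conda_cpu / mamba_cpu
    printf "non-CPU time: %.1fs vs %.1fs\n", $1 - conda_cpu, $4 - mamba_cpu
}')

echo "$summary"
```

This is a single measurement on one machine and one network connection, so treat the ratios as indicative rather than exact.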

Note: Outside any specific best practice being demonstrated, the Dockerfiles in this article are not examples of best practices, since the added complexity would obscure the main point of the article.

Introducing Mamba

Mamba is a re-implementation of the Conda package manager, designed to be:

  • Fast.
  • Backwards compatible, with the same command-line options.
  • Eventually, extended with more features.

If you look back at the two Dockerfiles above, you’ll notice that once Mamba was installed, all you had to do was replace conda with mamba on the command line. That’s true in general, and you can use Mamba for all your other Conda environment interactions.
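Because the command-line options match, a script can pick whichever tool happens to be installed. A minimal sketch, assuming both tools are on the PATH when present; the function name conda_cmd is hypothetical, not something Conda or Mamba provides:

```shell
# Print "mamba" if it is installed, otherwise fall back to "conda".
# (conda_cmd is a made-up helper name for this example.)
conda_cmd() {
    if command -v mamba >/dev/null 2>&1; then
        echo mamba
    else
        echo conda
    fi
}

echo "Using: $(conda_cmd)"

# Example use, left commented out so this snippet has no side effects:
# "$(conda_cmd)" env create -f environment.yml
```

This keeps shared scripts working on machines that only have Conda, while using Mamba wherever it is available.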

Mamba has been in development since March 2019, has had 1.5 million downloads since then, and at least in my testing of environment creation seems to work just fine.

Installing Mamba

  • You can install Mamba into a specific Conda environment as we did above, with conda install -c conda-forge mamba.
  • If this is your development machine, you’ll want to do conda install mamba -n base -c conda-forge so it’s available in all environments.
  • If you’re setting up a new development machine, and you’re primarily using Conda-Forge, there is another option. Conda-Forge provides an alternative to the normal miniconda installer.

This alternative installer:

  1. Uses conda-forge as the default channel, instead of Anaconda’s commercially supported default channel.
  2. Optionally, comes pre-packaged with Mamba.

Speeding up Docker builds a little bit more

Looking back at the Dockerfile using mamba above, it still has one slow step:

# Dockerfile.conda-then-mamba
FROM continuumio/miniconda3
COPY environment.yml .
RUN conda install -c conda-forge mamba  # <-- STILL SLOW
RUN /bin/bash -c "time mamba env create -f environment.yml"

We’re still using conda to get Mamba installed, which means that one install is still going to slow us down. What to do?

A base image with Mamba pre-installed

As mentioned above, Conda-Forge has an installer that comes with Mamba pre-installed… and they also provide Docker images. Which means we can use the following Dockerfile instead:

# Dockerfile.just-mamba
FROM condaforge/mambaforge
COPY environment.yml .
RUN mamba env create -f environment.yml

No need to install mamba separately!

Let’s see the end-to-end speed of the two ways of using Mamba. I’m going to measure the total time of the Docker build, not including the time it takes to download the base image; in practice they’re both about the same size so that probably wouldn’t affect the results either.

First, here’s our initial attempt, which installs Mamba with Conda and then uses Mamba to create the environment:

$ time docker build -q --no-cache -f Dockerfile.conda-then-mamba .
sha256:9dc3e8d04ccf58862aa172a944d8010569c31b196e047d644ee2341816026282

real    1m21.384s
user    0m0.019s
sys     0m0.014s

Second, here’s our second attempt, using the condaforge/mambaforge image:

$ time docker build -q --no-cache -f Dockerfile.just-mamba .
sha256:5f9f44818d78b0db6f281007ad4a456c06b37245bb33b2f5c3a5af95b38a3f6c

real    0m53.383s
user    0m0.020s
sys     0m0.013s

So it looks like using the Docker base image with Mamba pre-installed saves us about 28 seconds. This isn’t bad, but it isn’t quite as exciting as the speed-up from switching to Mamba in the first place:

  • Unlike the speedup from switching, which will probably scale at least somewhat with number of packages, this is a fixed overhead: you’re just saving the one-time RUN conda install -c conda-forge mamba.
  • On your development computer, you can just have Mamba installed in the base environment (see above), so there isn’t really much savings.
  • In the context of Docker builds, the fixed cost of RUN conda install -c conda-forge mamba will go away after the first build if you’re using Docker layer caching, which you should be.
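One detail matters for that last point: Docker reuses a cached layer only while every earlier instruction is unchanged. In Dockerfile.conda-then-mamba above, COPY environment.yml comes before the Mamba install, so editing environment.yml would rerun that install too. Here is a minimal sketch of a reordered Dockerfile, written out from the shell. The file name Dockerfile.cached-mamba is made up for this example, and the -y flag is my addition so the install never waits for confirmation.

```shell
# Write a Dockerfile that installs Mamba *before* copying environment.yml.
# (Dockerfile.cached-mamba is a hypothetical file name for this example.)
cat > Dockerfile.cached-mamba <<'EOF'
FROM continuumio/miniconda3
# This layer depends only on the base image, so it stays cached
# when environment.yml changes.
RUN conda install -y -c conda-forge mamba
# Anything from here on is rebuilt when environment.yml changes.
COPY environment.yml .
RUN mamba env create -f environment.yml
EOF

# To try it out (requires Docker, so left commented out here):
# docker build -f Dockerfile.cached-mamba .
```

With this ordering, a rebuild after changing environment.yml should only pay for the mamba env create step, as long as the cached layers are still available on the build machine.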

Use Mamba!

Whether developing on your machine or packaging with Docker, you should use Mamba to install your Conda packages. It’s a whole lot faster, with the same functionality.