The easiest way to speed up Python with Rust

If you want to speed up some existing Python code, writing a compiled extension in Rust can be an excellent choice:

  • In many situations, Rust code can run much faster than Python.
  • Rust prevents most of the memory-management bugs that often occur in C, C++, and Cython code.
  • There is a growing ecosystem of third-party Rust packages available, and unlike C and C++ it also has a built-in package manager and build system.

However, if you just want to prototype a Rust extension, packaging and integration boilerplate can get in the way: every extra bit of friction makes it harder to run the simple experiment of finding out whether Rust will help.

This is where rustimport comes in, a library that makes standalone Rust files easily importable in Python (currently only on Linux and macOS). In this article we’ll cover:

  • How to use rustimport to quickly try out your Rust code.
  • The most common performance mistake Rust beginners make, and how to avoid it.
  • Some gotchas when using rustimport.

Note: As is typically the case for Rust extensions, you’ll still be using PyO3 to bridge Python and Rust, but as we’ll see rustimport makes it extra easy to use PyO3 compared to alternative packaging methods.

Prerequisites

rustimport compiles your Rust code on the fly, so you’re going to need to have a Rust compiler installed on your computer. This is not what you want for your final packaged library or application, but our use case is prototyping, where you’d need a compiler regardless.

You also need to have rustimport installed, for example by running pip install rustimport in a virtualenv or Conda environment.

Example: Calculating the Fibonacci sequence

We’re going to implement the Fibonacci sequence in Rust, but we’ll start by implementing it in Python so we have a basis for comparing the performance:

def fibonacci(number: int) -> int:
    if number == 0:
        return 0
    if number == 1:
        return 1
    prevprev = 0
    prev = 1
    current = 1
    for _ in range(number - 1):
        current = prevprev + prev
        prevprev = prev
        prev = current
    return current

We can then time this implementation using the IPython interactive prompt’s %timeit magic:

In [1]: from pyfib import fibonacci

In [2]: %timeit fibonacci(50)
954 ns ± 1.66 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
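
If you are not using IPython, a rough equivalent can be sketched with the standard library's timeit module. This is only a sketch: the loop and repeat counts are arbitrary choices, and it reports the best run rather than the mean that %timeit prints, so the numbers will not match exactly.

```python
import timeit


def fibonacci(number: int) -> int:
    if number == 0:
        return 0
    if number == 1:
        return 1
    prevprev = 0
    prev = 1
    current = 1
    for _ in range(number - 1):
        current = prevprev + prev
        prevprev = prev
        prev = current
    return current


# Call the function many times per run, repeat the run a few times,
# and keep the fastest run to reduce noise from the rest of the system.
loops = 100_000
best_run_seconds = min(timeit.repeat(lambda: fibonacci(50), number=loops, repeat=5))
per_call_ns = best_run_seconds / loops * 1e9
print(f"{per_call_ns:.0f} ns per call")
```

The lambda adds a little call overhead of its own, so treat the result as a ballpark figure.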

Implementing the Fibonacci sequence in Rust

Let’s implement the same function in Rust, and expose it to Python. We’ll create a new standalone file, rustfib.rs, that looks like this:

// rustimport:pyo3

// PyO3 is a Rust library for writing Python extensions;
// we'll import its most commonly used APIs.
use pyo3::prelude::*;

#[pyfunction]  // ← expose the function to Python
fn fibonacci(number: u64) -> u64 {
    if number == 0 {
        return 0;
    }
    if number == 1 {
        return 1;
    }
    let mut prevprev = 0;
    let mut prev = 1;
    let mut current = 1;
    for _ in 0..(number - 1) {
        current = prevprev + prev;
        prevprev = prev;
        prev = current;
    }
    current
}
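
One difference from the Python version is worth keeping in mind: Python integers grow as needed, while the Rust function returns a fixed-size u64. The following sketch (plain Python, using a compact but equivalent version of the loop) finds the largest input whose result still fits in 64 unsigned bits:

```python
def fibonacci(number: int) -> int:
    """Compact equivalent of the Python implementation shown earlier."""
    prevprev, prev = 0, 1
    for _ in range(number):
        prevprev, prev = prev, prevprev + prev
    return prevprev


U64_MAX = 2**64 - 1  # largest value a Rust u64 can hold

# Walk upwards until the *next* Fibonacci number no longer fits in a u64.
largest_safe_input = 0
while fibonacci(largest_safe_input + 1) <= U64_MAX:
    largest_safe_input += 1

print(largest_safe_input, fibonacci(largest_safe_input))
```

The answer is 93. For larger inputs the Rust addition overflows; with Cargo's default settings the dev profile checks for overflow and panics, while the release profile does not check.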

We’ll go over much of what this means in a little bit, but before that let’s see how it runs.

By default, if we try importing a Rust file in a Python prompt, it won’t work. Python doesn’t know how to import files ending with .rs:

In [1]: from rustfib import fibonacci
ModuleNotFoundError: No module named 'rustfib'

This is where rustimport comes in. As the name implies, it lets you import Rust files; all you have to do is import rustimport.import_hook first.

In [2]: import rustimport.import_hook

In [3]: from rustfib import fibonacci
    Updating crates.io index
  Downloaded proc-macro2 v1.0.64
  Downloaded 1 crate (44.8 KB) in 0.32s
   Compiling target-lexicon v0.12.8
   Compiling autocfg v1.1.0
...
   Compiling pyo3-macros v0.18.3
   Compiling rustfib v0.1.0 (/tmp/rustimport/rustfib-7eb3578b36e7d1e44917eae13823c3f8/rustfib)
    Finished dev [unoptimized + debuginfo] target(s) in 10.13s

In [4]: fibonacci(50)
Out[4]: 12586269025

We just imported a Rust file, it got compiled automatically, and then we ran the code!

If we look at the files in the directory we can see that a new compiled Python extension was created:

$ ls
rustfib.cpython-311-x86_64-linux-gnu.so  rustfib.rs

At this point we don’t need to import rustimport.import_hook: we can just import the extension because it’s already compiled:

$ python -c "import rustfib; print(rustfib.fibonacci(10))"
55
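
The plain import works because Python's standard import machinery already recognizes compiled extension modules by their filename suffix. As a small sketch, you can list the suffixes your interpreter accepts; the is_extension_filename helper is a hypothetical name used only for illustration:

```python
import importlib.machinery

# Suffixes the current interpreter treats as compiled extension modules.
# On Linux with CPython 3.11 this includes ".cpython-311-x86_64-linux-gnu.so".
print(importlib.machinery.EXTENSION_SUFFIXES)


def is_extension_filename(filename: str) -> bool:
    """Hypothetical helper: does this filename look like an importable extension?"""
    return any(
        filename.endswith(suffix) for suffix in importlib.machinery.EXTENSION_SUFFIXES
    )
```

The exact list depends on your platform and Python version, which is also why a compiled extension built for one interpreter is not automatically importable by another.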

Before we talk about performance, let’s explain the code we wrote and what rustimport did for us.

Writing extensions with rustimport

rustimport reduces your effort in a number of different ways.

First, rustimport.import_hook will hook into Python’s import system; when it sees a Rust file with the relevant name, it will compile it to a Python extension and then import it. In order to prevent you from randomly importing any Rust file you have lying around, you need to have a special comment at the top of the file:

// rustimport:pyo3
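
To make the idea of an import hook less magical, here is a toy sketch of the general mechanism in pure Python. It is not rustimport's implementation: the .demo extension, the "# demoimport" marker, and the DemoFinder and install names are all made up for illustration, and instead of compiling Rust it simply executes Python source.

```python
import importlib.abc
import importlib.util
import sys
from pathlib import Path

MARKER = "# demoimport"  # made-up opt-in marker, analogous to "// rustimport:pyo3"


class DemoFinder(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Toy import hook: imports <name>.demo files whose first line is MARKER."""

    def find_spec(self, fullname, path=None, target=None):
        for directory in sys.path:
            candidate = Path(directory or ".") / f"{fullname}.demo"
            if not candidate.is_file():
                continue
            first_line = candidate.read_text().splitlines()[:1]
            if first_line == [MARKER]:
                return importlib.util.spec_from_loader(
                    fullname, self, origin=str(candidate)
                )
        return None  # not ours; other finders may still handle this name

    def create_module(self, spec):
        return None  # fall back to Python's default module creation

    def exec_module(self, module):
        origin = module.__spec__.origin
        source = Path(origin).read_text()
        # A real hook would build an extension here; the toy just runs the source.
        exec(compile(source, origin, "exec"), module.__dict__)


def install() -> None:
    """Register the finder so that later imports consult it."""
    sys.meta_path.append(DemoFinder())
```

After calling install(), importing a module named hello would load hello.demo from any directory on sys.path, but only if the file opts in with the marker on its first line.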

Second, the typical way Python extensions are written in Rust is using a third-party library called PyO3. In a normal Rust setup you’d have to add pyo3 as a dependency; the // rustimport:pyo3 comment above tells rustimport to do that for you automatically.

Third, a normal PyO3 extension would require you to have a module initialization function, where you’d register the function. In our case it might look like this:

// This is optional if you're using rustimport:
#[pymodule]
fn rustfib(_py: Python<'_>, m: &PyModule) -> PyResult<()> {
    m.add_function(wrap_pyfunction!(fibonacci, m)?)?;
    Ok(())
}

rustimport will generate this code for you automatically. You can write your own, however, if the default support for #[pyfunction] and #[pyclass] is not sufficient.

Fourth, you didn’t need to write or generate a Cargo.toml file, which most Rust packages would require.

Fixing the most common performance mistake beginners make with Rust

So how fast is our new extension, compared to the 954 nanoseconds our Python implementation took? Let’s measure it:

In [1]: import rustimport.import_hook

In [2]: from rustfib import fibonacci

In [3]: %timeit fibonacci(50)
920 ns ± 0.974 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

Isn’t Rust supposed to be fast?

Not necessarily. If you go back to where we compiled the extension initially, it said Finished dev [unoptimized + debuginfo] target(s) in 10.13s. The word "unoptimized" is a hint as to why this code ran slowly.

When you compile Rust code there are different compilation profiles, which are compiled in different ways. For our purposes, there are two interesting ones:

  • dev: Has extra debugging assertions and info, but is not optimized for speed. Compilation may be faster in some situations.
  • release: No debugging assertions or info, optimized for speed.

If you’re benchmarking your Rust code, or just want it to run at full speed, you need to compile with the release profile. And it’s very easy to forget to do so!

  • If you’re using cargo, use cargo build --release.
  • If you’re using setuptools-rust or maturin, the normal packaging tools you’d use to build Rust extensions for Python, develop mode (pip install -e or maturin develop) will install using the dev profile. You need to pip install normally to get release mode.

In the case of rustimport, we can compile using the release profile in a couple of different ways (see the README for details). We’ll use the method that involves setting an option before importing the Rust module. First, we delete the existing compiled extension. Then:

In [1]: import rustimport.import_hook

In [2]: import rustimport.settings

In [3]: rustimport.settings.compile_release_binaries = True

In [4]: from rustfib import fibonacci
   Compiling target-lexicon v0.12.8
   ...
   Compiling pyo3-macros v0.18.3
   Compiling rustfib v0.1.0 (/tmp/rustimport/rustfib-7eb3578b36e7d1e44917eae13823c3f8/rustfib)
    Finished release [optimized] target(s) in 5.77s

Notice that it says Finished release [optimized] target(s); we now have a release build, which means we can benchmark the code to get a more realistic sense of Rust’s performance capabilities:

In [5]: %timeit fibonacci(50)
44.7 ns ± 0.0438 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

Comparing all three versions:

Version                  Time per call (lower is better)
Python                   954 ns
Rust (dev profile)       920 ns
Rust (release profile)   45 ns
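
To put these numbers in relative terms, here is a small sketch that turns the %timeit measurements from this article into speedup factors over the pure Python version:

```python
# Per-call times in nanoseconds, copied from the %timeit output in this article.
timings_ns = {
    "Python": 954,
    "Rust (dev profile)": 920,
    "Rust (release profile)": 44.7,
}

baseline_ns = timings_ns["Python"]

# Speedup = how many times faster than the Python baseline each version is.
speedups = {name: baseline_ns / ns for name, ns in timings_ns.items()}

for name, factor in speedups.items():
    print(f"{name}: {factor:.2f}x")
```

The dev build is only about 4% faster than Python, while the release build is roughly 21 times faster.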

Using dependencies

What if your prototype wants to use a third-party Rust library? Typically with a Rust program you’d edit Cargo.toml, or just run cargo add thedependency on the command line to add it automatically. But you don’t have a Cargo.toml when using rustimport.

Instead, you can embed the relevant Cargo.toml entries in your importable Rust file, using a special comment syntax:

// rustimport:pyo3

// Embedded Cargo.toml:
//: [dependencies]
//: fibext = "0.2"

use fibext::Fibonacci;
use pyo3::prelude::*;

#[pyfunction]
fn fibonacci(number: u64) -> u64 {
    Fibonacci::new().nth(number as usize).unwrap()
}

In this example, instead of calculating the Fibonacci sequence ourselves, we use the fibext package that already implements it.

Avoiding using old code by mistake

rustimport will notice if the Rust file and compiled extension have diverged, and recompile as necessary. However, if you’re not careful it’s easy to end up running old code even if you’ve edited the Rust file:

  1. If you forget to import rustimport.import_hook, an existing compiled extension from the filesystem will be used.
  2. If you don’t restart Python, it will just stick to the first version you imported; it won’t ever update the already-imported module.
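
For the first case, a quick sanity check can help. The following is a hypothetical helper, not part of rustimport and not how rustimport itself detects changes: it simply compares file modification times to flag compiled extensions that are older than the Rust source sitting next to them.

```python
import importlib.machinery
from pathlib import Path


def stale_extensions(source: Path) -> list[Path]:
    """Return compiled extensions next to `source` that are older than it.

    Hypothetical helper for illustration; it relies only on modification
    times, so treat the result as a hint rather than proof.
    """
    source_mtime = source.stat().st_mtime
    stale = []
    for suffix in importlib.machinery.EXTENSION_SUFFIXES:
        candidate = source.with_name(source.stem + suffix)
        if candidate.exists() and candidate.stat().st_mtime < source_mtime:
            stale.append(candidate)
    return stale
```

For example, stale_extensions(Path("rustfib.rs")) returning a non-empty list suggests that a plain import rustfib would run code built from an older version of the file.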

When to use rustimport

Where rustimport shines is prototyping: it makes it very simple to just try out some Rust code and see how it runs with your code. The two main Rust integrations for Python packaging are less suitable for prototyping within an existing codebase than rustimport:

  • setuptools-rust requires updating your setup.py, and having an additional Cargo.toml, and setting up a directory structure for Rust. This is pretty quick, to be fair, but it’s still extra work.
  • maturin is an excellent tool for easily creating new Rust-based Python libraries, but is not really designed for adding Rust to existing projects.

In both cases you need to manually install or rebuild your Rust code every time it changes; not a lot of work, but it’s still more friction.

However, when it comes to shipping to production, I’d probably switch away from rustimport. For example, a single Rust file imported via rustimport lacks the Cargo.lock file that normal Rust projects have, which pins the exact versions of all dependencies, including transitive ones. This means you won’t get reproducible builds with the same exact dependencies each time. And while rustimport does scale up to full Rust projects, at that point its benefits decline, and it’s likely better to switch to a more explicit build system.