The easiest way to speed up Python with Rust
If you want to speed up some existing Python code, writing a compiled extension in Rust can be an excellent choice:
- In many situations, Rust code can run much faster than Python.
- Rust prevents most of the memory-management bugs that often occur in C, C++, and Cython code.
- There is a growing ecosystem of third-party Rust packages available, and unlike C and C++ it also has a built-in package manager and build system.
However, if you just want to prototype a Rust extension, packaging and integration boilerplate can get in the way: every extra bit of friction makes it harder to simply run the experiment and find out whether Rust will help.
This is where rustimport comes in: a library that makes standalone Rust files easily importable in Python (currently only on Linux and macOS).
In this article we’ll cover:
- How to use rustimport to quickly try out your Rust code.
- The most common performance mistake Rust beginners make, and how to avoid it.
- Some gotchas when using rustimport.
Note: As is typically the case for Rust extensions, you’ll still be using PyO3 to bridge Python and Rust, but as we’ll see, rustimport makes it extra easy to use PyO3 compared to alternative packaging methods.
Prerequisites
rustimport compiles your Rust code on the fly, so you’re going to need to have a Rust compiler installed on your computer.
This is not what you want for your final packaged library or application, but our use case is prototyping, where you’d need a compiler regardless.
You also need to have rustimport installed, for example by running pip install rustimport in a virtualenv or Conda environment.
Example: Calculating the Fibonacci sequence
We’re going to implement the Fibonacci sequence in Rust, but we’ll start by implementing it in Python so we have a basis for comparing the performance:
def fibonacci(number: int) -> int:
    if number == 0:
        return 0
    if number == 1:
        return 1
    prevprev = 0
    prev = 1
    current = 1
    for _ in range(number - 1):
        current = prevprev + prev
        prevprev = prev
        prev = current
    return current
We can then time this implementation using the IPython interactive prompt’s %timeit magic:
In [1]: from pyfib import fibonacci
In [2]: %timeit fibonacci(50)
954 ns ± 1.66 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
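Outside IPython, a similar measurement can be approximated with the standard library's timeit module. The following is a minimal sketch, assuming the function is defined in the same file rather than imported from pyfib. The helper name time_per_call_ns and the loop counts are illustrative choices, not part of the benchmark above.

```python
import timeit


def fibonacci(number: int) -> int:
    """Iterative Fibonacci, same algorithm as the version above."""
    if number == 0:
        return 0
    if number == 1:
        return 1
    prevprev = 0
    prev = 1
    current = 1
    for _ in range(number - 1):
        current = prevprev + prev
        prevprev = prev
        prev = current
    return current


def time_per_call_ns(func, arg, loops=100_000, repeats=5):
    """Return the average time per call, in nanoseconds, of the fastest repeat.

    Hypothetical helper: each repeat calls func(arg) `loops` times.
    """
    timer = timeit.Timer(lambda: func(arg))
    best_total_seconds = min(timer.repeat(repeat=repeats, number=loops))
    return best_total_seconds / loops * 1e9


if __name__ == "__main__":
    print(f"fibonacci(50): {time_per_call_ns(fibonacci, 50):.0f} ns per call")
```

The lambda wrapper adds a little call overhead of its own, so the number will likely come out somewhat higher than what %timeit reports, and it reports the fastest repeat rather than a mean and standard deviation.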
Implementing the Fibonacci sequence in Rust
Let’s implement the same function in Rust, and expose it to Python.
We’ll create a new standalone file, rustfib.rs, that looks like this:
// rustimport:pyo3

// PyO3 is a Rust library for writing Python extensions;
// we'll import its most commonly used APIs.
use pyo3::prelude::*;

#[pyfunction] // ← expose the function to Python
fn fibonacci(number: u64) -> u64 {
    if number == 0 {
        return 0;
    }
    if number == 1 {
        return 1;
    }
    let mut prevprev = 0;
    let mut prev = 1;
    let mut current = 1;
    for _ in 0..(number - 1) {
        current = prevprev + prev;
        prevprev = prev;
        prev = current;
    }
    current
}
We’ll go over much of what this means in a little bit, but before that let’s see how it runs.
By default, if we try importing a Rust file in a Python prompt, it won’t work.
Python doesn’t know how to import files ending with .rs:
In [1]: from rustfib import fibonacci
ModuleNotFoundError: No module named 'rustfib'
This is where rustimport comes in.
As the name implies, it lets you import Rust files; all you have to do is import rustimport.import_hook first.
In [2]: import rustimport.import_hook
In [3]: from rustfib import fibonacci
Updating crates.io index
Downloaded proc-macro2 v1.0.64
Downloaded 1 crate (44.8 KB) in 0.32s
Compiling target-lexicon v0.12.8
Compiling autocfg v1.1.0
...
Compiling pyo3-macros v0.18.3
Compiling rustfib v0.1.0 (/tmp/rustimport/rustfib-7eb3578b36e7d1e44917eae13823c3f8/rustfib)
Finished dev [unoptimized + debuginfo] target(s) in 10.13s
In [4]: fibonacci(50)
Out[4]: 12586269025
We just imported a Rust file, it got compiled automatically, and then we ran the code!
If we look at the files in the directory we can see that a new compiled Python extension was created:
$ ls
rustfib.cpython-311-x86_64-linux-gnu.so rustfib.rs
At this point we don’t need to import rustimport.import_hook: we can just import the extension because it’s already compiled:
$ python -c "import rustfib; print(rustfib.fibonacci(10))"
55
Before we talk about performance, let’s explain the code we wrote and what rustimport did for us.
Writing extensions with rustimport
rustimport reduces your effort in a number of different ways.
First, rustimport.import_hook will hook into Python’s import system; when it sees a Rust file with the relevant name, it will compile it to a Python extension and then import it.
In order to prevent you from randomly importing any Rust file you have lying around, you need to have a special comment at the top of the file:
// rustimport:pyo3
Second, the typical way Python extensions are written in Rust is using a third-party library called PyO3.
In a normal Rust setup you’d have to add pyo3 as a dependency; the // rustimport:pyo3 comment above tells rustimport to do that for you automatically.
Third, a normal PyO3 extension would require you to have a module initialization function, where you’d register the function. In our case it might look like this:
// This is optional if you're using rustimport:
#[pymodule]
fn rustfib(_py: Python<'_>, m: &PyModule) -> PyResult<()> {
    m.add_function(wrap_pyfunction!(fibonacci, m)?)?;
    Ok(())
}
rustimport will generate this code for you automatically.
You can write your own, however, if the default support for #[pyfunction] and #[pyclass] is not sufficient.
Fourth, you didn’t need to write or generate a Cargo.toml file, which most Rust packages would require.
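To make the first point more concrete, here is a toy sketch of how an import hook can notice opted-in Rust files. It is not rustimport's actual implementation: the class name RustFileFinder, the helper is_importable_rust_file, and the assumption that the marker must be exactly the first line are all hypothetical, and the sketch only records what it finds instead of compiling anything.

```python
import importlib.abc
import sys
from pathlib import Path

MARKER = "// rustimport:pyo3"


def is_importable_rust_file(path: Path) -> bool:
    """Return True if the file exists and its first line is the opt-in marker."""
    if not path.is_file():
        return False
    with path.open(encoding="utf-8") as source:
        return source.readline().strip() == MARKER


class RustFileFinder(importlib.abc.MetaPathFinder):
    """Toy finder: notices opted-in .rs files but does not compile them."""

    def __init__(self, search_dir: Path):
        self.search_dir = search_dir
        self.seen = []  # module names whose .rs file carried the marker

    def find_spec(self, fullname, path=None, target=None):
        candidate = self.search_dir / f"{fullname}.rs"
        if is_importable_rust_file(candidate):
            # A real hook would compile the file here and return a ModuleSpec
            # for the resulting extension module.
            self.seen.append(fullname)
        # Returning None tells Python to keep trying its other finders.
        return None


if __name__ == "__main__":
    finder = RustFileFinder(Path.cwd())
    sys.meta_path.insert(0, finder)
    try:
        import rustfib  # noqa: F401
    except ImportError:
        pass
    print("Opted-in Rust files noticed:", finder.seen)
```

Python consults each entry in sys.meta_path in order when resolving an import, which is why importing a hook module before importing the Rust file is enough to change what happens.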
Fixing the most common performance mistake beginners make with Rust
So how fast is our new extension, compared to the 954 nanoseconds our Python implementation used? Let’s measure it:
In [1]: import rustimport.import_hook
In [2]: from rustfib import fibonacci
In [3]: %timeit fibonacci(50)
920 ns ± 0.974 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
Isn’t Rust supposed to be fast?
Not necessarily.
If you go back to where we compiled the extension initially, it said Finished dev [unoptimized + debuginfo] target(s) in 10.13s.
The word unoptimized is a hint as to why this code ran slowly.
When you compile Rust code there are different compilation profiles, which are compiled in different ways. For our purposes, there are two interesting ones:
- dev: Has extra debugging assertions and info, but is not optimized for speed. Compilation may be faster in some situations.
- release: No debugging assertions or info, optimized for speed.
If you’re benchmarking your Rust code, or just want it to run at full speed, you need to compile with the release profile.
And it’s very easy to forget to do so!
- If you’re using cargo, use cargo build --release.
- If you’re using setuptools-rust or maturin, the normal packaging tools you’d use to build Rust extensions for Python, develop mode (pip install -e or maturin develop) will install using the dev profile. You need to pip install normally to get release mode.
In the case of rustimport, we can compile using the release profile in a couple of different ways (see the README for details).
We’ll use the method that involves setting an option before importing the Rust module.
First, we delete the existing compiled extension.
Then:
In [1]: import rustimport.import_hook
In [2]: import rustimport.settings
In [3]: rustimport.settings.compile_release_binaries = True
In [4]: from rustfib import fibonacci
Compiling target-lexicon v0.12.8
...
Compiling pyo3-macros v0.18.3
Compiling rustfib v0.1.0 (/tmp/rustimport/rustfib-7eb3578b36e7d1e44917eae13823c3f8/rustfib)
Finished release [optimized] target(s) in 5.77s
Notice that it says Finished release [optimized] target(s); we now have a release build.
That means we can benchmark the code to get a more realistic sense of Rust’s performance capabilities:
In [5]: %timeit fibonacci(50)
44.7 ns ± 0.0438 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
Comparing all three versions:
| Version | Time per call (lower is better) |
|---|---|
| Python | 954 ns |
| Rust (dev profile) | 920 ns |
| Rust (release profile) | 45 ns |
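To turn those timings into speedup factors, a small worked calculation helps. This sketch uses the %timeit results reported above (954 ns, 920 ns, and 44.7 ns); the variable names are illustrative.

```python
# Timings reported by %timeit above, in nanoseconds per call.
timings_ns = {
    "Python": 954.0,
    "Rust (dev profile)": 920.0,
    "Rust (release profile)": 44.7,
}

# Speedup of each version relative to the pure-Python baseline.
baseline_ns = timings_ns["Python"]
speedups = {version: baseline_ns / ns for version, ns in timings_ns.items()}

for version, speedup in speedups.items():
    print(f"{version}: {speedup:.1f}x")
```

On these numbers the dev build is only about 4% faster than Python, while the release build is roughly 21 times faster.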
Using dependencies
What if your prototype wants to use a third-party Rust library?
Typically with a Rust program you’d edit Cargo.toml, or just run cargo add thedependency on the command-line to add it automatically.
But you don’t have a Cargo.toml when using rustimport.
Instead, you can add these clauses to your importable Rust file, with a special syntax:
// rustimport:pyo3
// Embedded Cargo.toml:
//: [dependencies]
//: fibext = "0.2"
use fibext::Fibonacci;
use pyo3::prelude::*;
#[pyfunction]
fn fibonacci(number: u64) -> u64 {
    Fibonacci::new().nth(number as usize).unwrap()
}
In this example, instead of calculating the Fibonacci sequence ourselves, we use the fibext package that already implements it.
Avoiding using old code by mistake
rustimport will notice if the Rust file and compiled extension have diverged, and recompile as necessary.
However, if you’re not careful it’s easy to end up running old code even if you’ve edited the Rust file:
- If you forget to import rustimport.import_hook, an existing compiled extension from the filesystem will be used.
- If you don’t restart Python, it will just stick to the first version you imported; it won’t ever update the already-imported module.
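For the first gotcha, one way to catch a leftover extension is to compare file modification times before importing. The helper below is a hypothetical sketch, not something rustimport provides: the name find_stale_extensions is made up, and it assumes the compiled extension sits next to the Rust file with a name like rustfib.cpython-311-x86_64-linux-gnu.so.

```python
from pathlib import Path


def find_stale_extensions(rust_file: Path) -> list[Path]:
    """Return compiled extensions next to `rust_file` that are older than it.

    Assumes extensions live in the same directory and are named
    <module>.<abi tag>.so, as in the ls output earlier in this article.
    """
    source_mtime = rust_file.stat().st_mtime
    candidates = rust_file.parent.glob(f"{rust_file.stem}.*.so")
    return sorted(p for p in candidates if p.stat().st_mtime < source_mtime)


if __name__ == "__main__":
    source = Path("rustfib.rs")
    if source.exists():
        for stale in find_stale_extensions(source):
            print(f"Warning: {stale.name} is older than {source.name}")
```

Modification times are only a heuristic, so this is a sanity check for scripts that import the extension without the hook rather than a replacement for rustimport's own change detection.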
When to use rustimport
Where rustimport shines is prototyping: it makes it very simple to just try out some Rust code and see how it runs with your code.
The two main Rust integrations for Python packaging are less suitable for prototyping within an existing codebase than rustimport:
- setuptools-rust requires updating your setup.py, and having an additional Cargo.toml, and setting up a directory structure for Rust. This is pretty quick, to be fair, but it’s still extra work.
- maturin is an excellent tool for easily creating new Rust-based Python libraries, but is not really designed for adding Rust to existing projects.
In both cases you need to manually install or rebuild your Rust code every time it changes; not a lot of work, but it’s still more friction.
However, when it comes to shipping to production, I’d probably switch away from rustimport.
For example, a single Rust file imported via rustimport lacks the standard Cargo.lock file that normal Rust projects have, which records the exact locked versions of all dependencies, including transitive ones.
This means you won’t get reproducible builds with the same exact dependencies each time.
And while rustimport does scale up to full Rust projects, at that point its benefits decline, and it’s likely better to switch to a more explicit build system.