Faster pip installs: caching, bytecode compilation, and uv
Installing your Python application’s dependencies can be surprisingly slow. Whether you’re running tests in CI, building a Docker image, or installing an application, downloading and installing dependencies can take a while.
So how do you speed up installation with `pip`?
In this article I’ll cover:
- Avoiding the slow path of installing from source.
- The package cache.
- Bytecode compilation and how it interacts with installation and startup speed.
- Using `uv`, a faster replacement for `pip`, and why it’s not always as fast as it might initially seem.
Avoiding installs from source
When you install a Python package, there are typically two ways it can be installed:
- The packaged-up source code, often a `.tar.gz` with a `pyproject.toml` or (for old packages) a `setup.py`. In this case, installing will often require running Python code (a little slow), and sometimes compiling large amounts of C/C++/Rust code (potentially extremely slow).
- A wheel (a `.whl` file) that can just be unpacked straight onto the filesystem, with no need to run code or compile native extensions.
If at all possible, you want to install wheels, because installing from source will be slower. If you need to compile significant amounts of C code, installing from source will be much slower; instead of relying on precompiled binaries, you’ll need to compile it all yourself.
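Whether a particular wheel can be used depends on your interpreter and platform. As a quick check, `pip` can print the full list of wheel compatibility tags it will accept in your environment:

```shell
# Print the wheel compatibility tags (Python version, ABI, platform)
# that pip will accept in this environment; a wheel whose filename
# matches one of these tags can be installed directly.
python -m pip debug --verbose
```

If a package on PyPI has no wheel matching any of these tags, `pip` will fall back to building from source.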
To ensure you’re installing wheels in as many cases as possible:
- Make sure you’re using the latest version of `pip` before installing dependencies. Binary wheels sometimes require newer versions of `pip` than the one packaged by default with your current Python. Or better yet, as we’ll discuss below, use `uv`.
- Don’t use Alpine Linux; stick to Linux distributions that use `glibc`, e.g. Debian/Ubuntu/RedHat. Standard Linux wheels require `glibc`, but Alpine uses the `musl` C library. Wheels for `musl`-based distributions like Alpine are available for many projects, but they’re less common.
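If you’d rather have installs fail loudly than silently fall back to a slow source build, `pip`’s `--only-binary` option enforces wheels-only installation (`requirements.txt` below is a stand-in for your own requirements file):

```shell
# Upgrade pip first, so it recognizes the newest wheel tags
python -m pip install --upgrade pip

# Refuse to install anything that isn't available as a wheel
pip install --only-binary=:all: -r requirements.txt
```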
Note: If you maintain open source or otherwise installable Python packages: since wheels install faster, make sure to provide wheels for your package, even if it’s pure Python.
Keeping the package cache warm
Once you’ve dealt with that, the next question is whether you can avoid downloading the packages. Installing Python packages involves two steps:
- Downloading the package, if necessary.
- Installing the already downloaded package.
To speed up the first step, when a package manager like `pip` downloads a package, it will typically store a copy locally in a cache: some tool-specific directory on the filesystem.
That means the next time you install that package, the package manager can check if it already has a cached copy, and if so just install that.
This saves download time.
If the package is already in the cache, we say the cache is “warm”. Here’s how this impacts performance, measuring both wallclock and CPU time:
| Tool | Cache | Wallclock time | CPU time |
|---|---|---|---|
| `pip install` | Cold | 8.5s | 6.1s |
| `pip install` | Warm | 6.3s | 5.6s |
This difference in speed is tied to the latency and bandwidth of my Internet connection, so it could be better or worse in other locations. In general, however, the cold-cache version will always be slower.
Benchmarking methodology: I made sure to create the virtualenvs in advance, and I used hashes in the `requirements.txt` since that really should be the default for security reasons. I used the transitive dependencies for installing `pandas` and `matplotlib`, resulting in the installation of 14 different packages in total. I used Python 3.13, on a CPU with ~20 cores, `pip` version 24.3.1, and `uv` version 0.5.22.
One problem with the package cache is that in most CI services, your cache will start out empty, since you’re starting with a new blank virtual machine or container. To work around that, most CI systems will have some way to store a cache directory at the end of the run, and then load it at the beginning of the next run. If you’re using GitHub Actions, you can use the built-in caching support in the action used to setup Python (you’re going to be caching the cache!).
Of course, storing and loading the cache also takes time, so if you have many or large dependencies try it both ways and see which is faster.
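For example, with GitHub Actions the `actions/setup-python` action can save and restore `pip`’s download cache for you; a minimal sketch (pin versions as appropriate for your workflow):

```yaml
- uses: actions/setup-python@v5
  with:
    python-version: "3.13"
    cache: "pip"  # save/restore pip's download cache between runs
- run: pip install -r requirements.txt
```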
Reallocating slowness by disabling bytecode compilation
Another task that can slow down package installation is bytecode compilation.
After packages are unpacked onto the filesystem, package managers sometimes do one final step: they compile the `.py` source files into `.pyc` bytecode files, and store them in `__pycache__` directories.
This is not the same as compiling a C extension; it’s just an optimization to make loading Python code faster at startup.
Instead of having to compile a module to a `.pyc` at import time, the `.pyc` is already there.
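You can see this mechanism directly using the standard library’s `compileall` module, which is essentially what package managers invoke after unpacking (the `demo_pkg` directory here is made up for illustration):

```shell
# Create a tiny package with one module
mkdir -p demo_pkg
echo "x = 1" > demo_pkg/mod.py

# Pre-compile .py files to .pyc bytecode, as pip does at install time
python -m compileall -q demo_pkg

# The bytecode is stored next to the source, in __pycache__/
ls demo_pkg/__pycache__/
```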
It turns out that bytecode compilation, which is on by default, takes a significant fraction of the time spent by `pip install`. We can see the performance impact of this step by using `pip install --no-compile`.
Here’s a comparison of how long it takes to install packages both with and without `.pyc` compilation, with a warm cache:
| Installation method | Cache | Wallclock time | CPU time |
|---|---|---|---|
| `pip install` | Warm | 6.3s | 5.6s |
| `pip install --no-compile` | Warm | 2.5s | 1.8s |
Importantly, just because disabling bytecode compilation speeds up installation doesn’t mean you’ve saved time overall.
Any module you import will still need to be compiled into a `.pyc`; it’s just that the work will happen when your program runs, instead of at package installation time.
So if you’re importing all or most modules, overall you might not save any time at all; you’ve just moved the work to a different place.
In other cases, however, disabling bytecode compilation will save you time. For example, in your testing setup you might be installing many third-party packages for integration testing, but only using a small amount of those libraries’ code. As such, there’s no point in compiling lots of modules you won’t be using.
Switching to `uv`, a faster reimplementation of `pip`
`uv` is a mostly compatible re-implementation of `pip` and other related tools.
Out of the box, `uv` is much faster, because it:
- Is written in Rust, a faster language than Python.
- Downloads packages in parallel.
- Takes advantage of multiple CPUs.
- Disables bytecode compilation by default, making it opt-in as opposed to `pip`’s opt-out.
In their default configurations, `uv pip install` is much faster than `pip install`:
| Tool | Cache | Wallclock time | CPU time |
|---|---|---|---|
| `pip install` | Cold | 8.5s | 6.1s |
| `pip install` | Warm | 6.3s | 5.6s |
| `uv pip install` | Cold | 1.7s | 1.0s |
| `uv pip install` | Warm | 0.0s | 0.1s |
However, this is somewhat misleading.
With matching settings, `uv`’s performance lead declines
By default, as mentioned above, `pip` will do bytecode compilation and `uv` will disable it.
The above table therefore isn’t a fair comparison.
What happens if both tools have bytecode compilation enabled, by running `uv pip install --compile-bytecode`?
| Tool | Cache | Wallclock time | CPU time |
|---|---|---|---|
| `pip install` | Cold | 8.5s | 6.1s |
| `pip install` | Warm | 6.3s | 5.6s |
| `uv pip install --compile-bytecode` | Cold | 2.4s | 11.2s |
| `uv pip install --compile-bytecode` | Warm | 0.5s | 9.8s |
Wallclock time is still much faster, though less so, but the measured CPU time suggests `uv` is actually slower than `pip` when bytecode compilation is enabled.
This combination of faster wallclock time but higher CPU time is possible because `uv` uses multiple threads, taking advantage of multiple cores, and my CPU has 20 of them.
In CI the number of cores is likely much smaller: default x86-64 GitHub Actions Linux runners have four “cores”. It wouldn’t surprise me if these were vCPUs, effectively just two physical CPU cores. In any case, 4 cores is rather less than the 20 on the computer I used to test this.
The slower CPU time is not as bad as it looks, however.
In order to compile bytecode, `uv` launches Python worker processes, which have a fixed startup overhead.
My guess was that with fewer cores, `uv` would use fewer threads by default and therefore launch fewer worker processes.
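On Linux, this kind of single-core restriction can be reproduced with `taskset` (a sketch of the approach, not necessarily the exact setup I used; `requirements.txt` is a placeholder):

```shell
# Pin the whole install to CPU core 0, simulating a single-core runner;
# taskset is part of util-linux
taskset -c 0 uv pip install --compile-bytecode -r requirements.txt
```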
And in fact when using just a single CPU core:
| Tool run with single CPU core | Cache | Wallclock time | CPU time |
|---|---|---|---|
| `pip install` | Cold | 9.3s | 6.2s |
| `pip install` | Warm | 6.2s | 5.6s |
| `uv pip install --compile-bytecode` | Cold | 6.0s | 4.9s |
| `uv pip install --compile-bytecode` | Warm | 4.1s | 4.1s |
In this scenario, `pip`’s performance is unchanged other than some noise due to download speed; it’s single-threaded, after all.
Meanwhile `uv` is still faster than `pip`… but a lot less so.
Make your package installation faster
Some takeaways:
- Test preserving your package download cache in CI to reduce the need for downloads; it might or might not help.
- `uv` is faster than `pip`, though how much faster depends on configuration.
- Decide whether or not bytecode compilation makes sense in your case.
- Once you’ve switched to `uv`, you’ll likely benefit from more CPU cores. That being said, when you have a warm cache and no compilation, `uv` is so fast that it doesn’t really matter how many cores you have!