Catching memory leaks with your test suite

Resource leaks are an unpleasant type of bug. Little by little your program uses more memory, or more file descriptors, or some other limited resource. Everything seems fine—until the resource runs out, and your program is dead.

In many cases you can catch this sort of bug in advance, by tweaking your test suite. Or, after you’ve discovered such a bug, you can use your test suite to identify what is causing it. In this article we’ll cover:

  • An example of a memory leak.
  • When your test suite may be a good way to identify the causes of leaks.
  • How to catch leaks using pytest.
  • Other types of leaks.

Memory leaks

Typically your program allocates some memory, and then eventually frees it. In Python, memory gets freed automatically once objects no longer have any references (or more accurately, no references from modules or running code). In compiled languages, memory might need to be freed manually.
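To make this concrete, here’s a small sketch (CPython-specific, since it relies on reference counting; the Data class is made up for illustration) showing that an object is freed as soon as its last reference goes away:

```python
import weakref

class Data:
    """A throwaway class, so we can attach a weak reference to an instance."""
    pass

obj = Data()
# A weak reference observes the object without keeping it alive:
ref = weakref.ref(obj)
assert ref() is obj

del obj  # drop the last strong reference
# In CPython, reference counting frees the object immediately,
# so the weak reference is now dead:
assert ref() is None
```

Other Python implementations (like PyPy) may only free the object later, when the garbage collector runs.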

What happens if you don’t free memory? As your process uses more and more memory, it will eventually slow down, crash, or get killed silently. For example, I ran this Python program:

$ python -m mypackage

Eventually it died; initially I wasn’t sure if it gave an error message or not, since at the same time my terminal got stuck and had to be killed too. Here’s what the Linux kernel logs said when I ran sudo dmesg:

...
[185329.418692] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=user.slice,mems_allowed=0,global_oom,task_memcg=/user.slice/user-1000.slice/user@1000.service/app.slice/app-gnome-emacs-22313.scope,task=python,pid=729871,uid=1000
[185329.418713] Out of memory: Killed process 729871 (python) total-vm:42680276kB, anon-rss:42213556kB, file-rss:1600kB, shmem-rss:0kB, UID:1000 pgtables:83572kB oom_score_adj:100

The program used too much memory, so the operating system killed it.

Identifying memory leaks with your test suite

One approach to identifying memory leaks is to use a memory profiler like Memray, Fil (easier to use, but less featureful), or Sciagraph (designed for memory and performance profiling of batch jobs in both development and production). But that may be difficult depending on your runtime environment and the kind of program you’re writing; for example, Sciagraph won’t help you with a webserver running in production, since it’s designed for batch jobs.

Another approach is to see if you can spot the memory leak using your test suite. In particular, this can work if:

  1. Your test suite has good coverage.
  2. The memory leak is easy to reproduce and doesn’t just happen in rare setups. You won’t know this in advance, of course.
  3. The memory leak is big enough to be noticeable while running the tests. Again, you won’t know this in advance.

If these conditions are true, and in many cases they will be, your test suite can help identify the source of memory leaks (or other resource leaks). Specifically for memory leaks, for every test:

  1. Before the test runs, measure how much memory is currently used.
  2. After the test finishes, measure memory use again.
  3. If the second number is higher, that suggests some memory was leaked by the test.
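The three steps above can be sketched as a standalone helper, using Python’s built-in tracemalloc module (the measure_leak name and the example functions are my own, not part of any library):

```python
import gc
import tracemalloc

def measure_leak(fn) -> int:
    """Return net bytes still allocated after running fn()."""
    tracemalloc.start()
    gc.collect()
    before = tracemalloc.get_traced_memory()[0]  # step 1: measure before
    fn()                                         # run the "test"
    gc.collect()
    after = tracemalloc.get_traced_memory()[0]   # step 2: measure after
    tracemalloc.stop()
    return after - before                        # step 3: compare

_kept = []

def leaky():
    # The allocation stays referenced after the function returns: a leak.
    _kept.append(bytearray(1_000_000))

def clean():
    # The allocation is freed when the local variable goes away.
    data = bytearray(1_000_000)

assert measure_leak(leaky) > 900_000
assert measure_leak(clean) < 10_000
```

The pytest fixture we build below follows exactly this before/after pattern.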

Implementing resource leak fixtures with pytest

There are various ways to measure memory usage:

  • tracemalloc is built into Python, and pretty granular in what it measures; the downside is that not all 3rd party compiled extensions integrate with tracemalloc, so for example Polars’ memory allocations won’t show up there.
  • Another option is psutil.Process().memory_full_info().uss, which won’t be able to notice small allocations, but will work for any library, since it measures process-level operating system information.

We’re going to use tracemalloc.

Our example package has tests in mypackage/tests/ that run with the popular pytest testing framework:

$ pytest
=================================================
test session starts
=================================================
collected 9 items

mypackage/tests/test_api.py .........     [100%]

=================================================
9 passed in 0.01s
=================================================

One of pytest’s features is fixtures: they allow you to run additional code before and after a test, and to expose additional resources to a test. For our purposes, we want a fixture that automatically runs for all tests. At the same time, we want to make it conditional, since checking for memory leaks will slow down execution. So we create an autouse fixture, and make it conditional on an environment variable being set.

We put this code into a file called mypackage/tests/conftest.py; pytest automatically loads conftest.py modules on startup.

import gc
import os
import tracemalloc as tmalloc

import pytest


if os.getenv("CHECK_LEAKS") == "1":
    @pytest.fixture(autouse=True)
    def check_for_memory_leaks():
        # This code gets run _before_ the test:
        tmalloc.start()
        # Garbage collect aggressively to clear memory:
        gc.collect()
        # Get current memory usage:
        current_mem_usage = tmalloc.get_traced_memory()[0]

        try:
            # The test will get run here:
            yield
        finally:
            # This code runs _after_ the test:
            gc.collect()
            final_mem_usage = tmalloc.get_traced_memory()[0]
            # Stop tracing before asserting, so tracing is cleaned
            # up even when the assertion fails:
            tmalloc.stop()
            # In practice memory usage change is never zero,
            # so fail only if more than 10KB was leaked.
            # Real-world leaks might require tweaking this.
            assert (
                final_mem_usage - current_mem_usage < 10_000
            ), "memory was leaked"

Debugging our memory leak

Now that we’ve added our fixture, we can use it to see if any tests are triggering memory leaks:

$ env CHECK_LEAKS=1 pytest mypackage/
...
mypackage/tests/test_api.py .....E....    [100%]
...

>           assert (
                final_mem_usage - current_mem_usage < 10_000
            ), "memory was leaked"
E           AssertionError: memory was leaked
E           assert (398827 - 0) < 10000

mypackage/tests/conftest.py:28: AssertionError
...
mypackage/tests/test_api.py::test_api - AssertionError: memory was leaked
...

So now we know that a specific test is triggering memory leaks. We go read that test:

from mypackage import api

# ...

def test_api():
    result = api.make_list_range(10_000)
    for i in range(10_000):
        assert result[i] == i

# ...

Now we know which business logic API is triggering memory leaks. We read the code for that function:

from functools import cache

# ...

@cache
def make_list_range(n: int) -> list[int]:
    return list(range(n))

# ...

This function is decorated with functools.cache, which means all its results will be cached in memory, keyed on the input. This means the code will run faster, but also means memory can grow indefinitely if it’s called with many different arguments. Perhaps it would be better to switch to functools.lru_cache, which only caches a limited number of results, or perhaps we should disable caching altogether.
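As a sketch of the first option (a hypothetical fix, not the package’s actual code), functools.lru_cache with a maxsize bound keeps the speed benefit while capping how much memory the cache can hold:

```python
from functools import lru_cache

@lru_cache(maxsize=128)  # only the 128 most recently used results are kept
def make_list_range(n: int) -> list[int]:
    return list(range(n))

# Even after many distinct arguments, the cache stays bounded:
for i in range(1000):
    make_list_range(i)
assert make_list_range.cache_info().currsize <= 128
```

The right maxsize depends on how big the cached values are and how much repetition the real workload has.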

Either way, we’ve identified the cause of the memory leak.

Catching other resource leaks

Memory is not the only resource that can leak. For example, in some programming domains leaking file descriptors can be a problem. Every time you open a file or socket, Linux and other Unix-like operating systems create a file descriptor. There’s a limit to how many file descriptors a process can have, so if for example you keep leaking open files, eventually your process will run out and won’t be able to open additional files or sockets.

Other resources that can leak include GPU memory, database handles, disk space, and more broadly any resource that has a fixed limit. Much like memory leaks, you may be able to catch this sort of leak via your test suite, with a similar fixture. For example, instead of measuring memory usage in the fixture, you instead measure the number of open file descriptors.
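Here’s a minimal sketch of that idea (Linux-specific, since it counts entries in /proc/self/fd, and the names are my own invention), written as a context manager rather than a full pytest fixture:

```python
import os
from contextlib import contextmanager

def open_fd_count() -> int:
    # On Linux, /proc/self/fd has one entry per open file descriptor.
    return len(os.listdir("/proc/self/fd"))

@contextmanager
def assert_no_fd_leak():
    # Measure before, run the wrapped code, measure after:
    before = open_fd_count()
    yield
    leaked = open_fd_count() - before
    assert leaked <= 0, f"leaked {leaked} file descriptor(s)"

# A well-behaved block passes:
with assert_no_fd_leak():
    f = open("/dev/null")
    f.close()
```

Turning this into an autouse pytest fixture works the same way as the memory fixture: measure before the yield, measure after, and assert the counts match.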

A good test suite is a resource not just for ensuring your code has the correct semantics and fulfills the necessary contracts, but in some cases for runtime characteristics as well. Invest in it accordingly.