Dying, fast and slow: out-of-memory crashes in Python

by Itamar Turner-Trauring
Last updated 12 Jan 2023, originally created 15 Jan 2021

A segfaulting program might be the symptom of a bug in C code–or it might be that your process is running out of memory. Crashing is just one symptom of running out of memory. Your process might instead just run very slowly, your computer or VM might freeze, or your process might get silently killed. Sometimes if you’re lucky you might even get a nice traceback, but then again, you might not.

So how do you identify out-of-memory problems?

With some understanding of how memory works in your operating system and in Python, you can learn to identify all the different ways out-of-memory problems can manifest:

A slow death.
An obvious death.
A disguised death.
Death by assassination.

A slow death: Swapping

When your computer’s RAM fills up, the operating system will start moving chunks of memory out of RAM and on to your disk, aka “swapping”. Specifically, it will try to move chunks of memory that aren’t being used. When some code tries to read or write to these chunks they will get loaded back into RAM.

Now, your disk is much slower than RAM, so this can lead to slowness if you’re doing a lot of swapping. At the extreme, your computer will technically still be running but for practical purposes will be completely locked up, as the amount of data the operating system is trying to read and write to disk exceeds the disk’s bandwidth. This is more common on personal computers, where you’re running many different programs that might use and touch a lot of memory: a browser, an IDE, the program you’re testing, and so on.

Even with swapping, if you allocate enough memory you’ll eventually run out of the combined RAM and swap space. At this point allocating more memory is impossible.
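If you suspect swapping is behind a slowdown, one way to check is to look at how much swap is in use. Here is a minimal sketch, assuming a Linux system that exposes /proc/meminfo; the helper names parse_meminfo() and swap_used_kib() are my own for illustration, not part of any library:

```python
from pathlib import Path


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integers.

    Most fields are reported in kB; a few (like HugePages_Total) are counts.
    """
    values = {}
    for line in text.splitlines():
        name, separator, rest = line.partition(":")
        fields = rest.split()
        if separator and fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


def swap_used_kib(meminfo):
    """Return how much swap is currently in use, in kB."""
    return meminfo.get("SwapTotal", 0) - meminfo.get("SwapFree", 0)


if __name__ == "__main__":
    # Assumption: Linux. Other operating systems do not provide this file.
    meminfo_path = Path("/proc/meminfo")
    if meminfo_path.exists():
        info = parse_meminfo(meminfo_path.read_text())
        print(f"Swap in use: {swap_used_kib(info)} kB of {info.get('SwapTotal', 0)} kB")
    else:
        print("No /proc/meminfo here; this sketch only works on Linux.")
```

A swap-in-use number that keeps growing while your program runs is a hint, though not proof, that you are running low on RAM.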

An obvious death: MemoryError tracebacks and other error messages

What happens when you can’t allocate any more memory?

When using Python, this will often result in the interpreter’s memory allocation APIs failing to allocate. At this point, Python will try to raise a MemoryError exception.

>>> import numpy
>>> numpy.ones((1_000_000_000_000,))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python3.9/site-packages/numpy/core/numeric.py", line 192, in ones
    a = empty(shape, dtype, order)
MemoryError: Unable to allocate 7.28 TiB for an array with shape (1000000000000,) and data type float64

This will get handled by whatever mechanisms your program uses for unexpected exceptions; with any luck, it’ll end up in the logs or terminal for you to read.

Of course, handling and printing that traceback also uses memory. So this sort of useful, clear traceback is much more likely when you did a large allocation that didn’t fit in memory. The big allocation fails, which means available memory doesn’t change, and hopefully there’s enough available to handle the exception.
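To illustrate the "one big allocation fails cleanly" case without NumPy, here is a small sketch using only the standard library. It assumes a 64-bit CPython, where a request for 2**60 bytes is larger than the address space and so fails immediately; try_allocate() is a hypothetical helper name:

```python
def try_allocate(num_bytes):
    """Try to allocate a zero-filled buffer; report failure instead of crashing."""
    try:
        buffer = bytearray(num_bytes)
    except MemoryError as error:
        # The failed request did not consume any memory, so there is
        # usually enough left over to format and print this message.
        print(f"Could not allocate {num_bytes:,} bytes: {error!r}")
        return False
    del buffer
    return True


if __name__ == "__main__":
    print(try_allocate(1024))      # a small request
    print(try_allocate(1 << 60))   # an absurdly large request (1 EiB)
```

Note that this only behaves nicely for requests that are rejected outright. A request that is large but plausible may succeed at first and cause trouble later, once the memory is actually used.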

If there’s not enough memory to handle the error reporting, my expectation is that you will eventually get some sort of crash, e.g. a stack overflow from infinite recursion, as each attempt to create a new MemoryError itself fails due to lack of memory.

A disguised death: Segfaults in C code

Under the hood, Python eventually delegates memory allocation to the standard C library APIs malloc(), free(), and related functions. When you call malloc(), you get back the address of a newly allocated chunk of memory. And if allocation fails you’ll get back NULL, the address 0.

For example, here we see the first allocation returns the address of the newly allocated memory:

>>> import ctypes
>>> libc = ctypes.CDLL("libc.so.6")
>>> libc.malloc(100)
439108304

The second allocation fails because I asked for far too much memory:

>>> libc.malloc(1_000_000_000_000)
0

Now, well-behaved code will check for a NULL returned from malloc(), and handle the error as best it can. It might exit with an error, or as we saw the Python interpreter raises a MemoryError exception.
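As a rough sketch of what "checking for NULL" looks like, here is a ctypes wrapper that turns a failed malloc() into a MemoryError. A few assumptions: a 64-bit Unix-like system, a C library that ctypes can locate, and a hypothetical helper name checked_malloc(). It also declares the return type explicitly, since by default ctypes treats return values as a C int, which can truncate a 64-bit pointer.

```python
import ctypes
import ctypes.util

# Assumption: a Unix-like system. If find_library() returns None,
# CDLL(None) falls back to the symbols already loaded into this process.
libc = ctypes.CDLL(ctypes.util.find_library("c"))
libc.malloc.argtypes = [ctypes.c_size_t]
libc.malloc.restype = ctypes.c_void_p
libc.free.argtypes = [ctypes.c_void_p]
libc.free.restype = None


def checked_malloc(size):
    """Call malloc() and raise MemoryError if it returns NULL.

    Only meant for size > 0: malloc(0) is allowed to return NULL
    even when nothing went wrong.
    """
    address = libc.malloc(size)
    if address is None:  # ctypes maps a NULL void* to None
        raise MemoryError(f"malloc({size}) returned NULL")
    return address


if __name__ == "__main__":
    address = checked_malloc(100)
    print(f"Got {address:#x}")
    libc.free(address)
    try:
        checked_malloc(1 << 60)
    except MemoryError as error:
        print(f"Handled: {error}")
```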

Buggy code will assume malloc() always returns successfully, and treat the 0 returned by malloc() as a valid memory address. Trying to write to address 0 will then result in a crash.

Consider the following C program:

#include <stdlib.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
  char *p;
  size_t allocation_size = strtoll(argv[1], &p, 10);

  char* data = malloc(allocation_size);
  /* Uh oh, didn't check for NULL return result. */
  
  data[0] = 'O';
  data[1] = 'K';
  data[2] = '\n';
  data[3] = 0;
  printf("%s", data);
}

It allocates some memory based on the first argument, and then writes to it—but it doesn’t check for a failed allocation.

If I run it with a small, successful allocation, everything works fine:

$ ./naive-malloc 1000
OK

But if the allocation is too big, the program crashes because it doesn’t have any error handling code.

$ ./naive-malloc 10000000000000000
Segmentation fault (core dumped)

Of course, segfaults happen for other reasons as well, so to figure out the cause you’ll need to inspect the core file with a debugger like gdb, or run the program under the Fil memory profiler.
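When the segfault happens inside a Python process, the standard library's faulthandler module can at least tell you which Python code was running at the time. It won't tell you whether memory exhaustion was the cause, but it narrows down where to look. The sketch below assumes Linux and deliberately crashes a child process by reading from address 0; run_with_faulthandler() is a hypothetical helper name:

```python
import signal
import subprocess
import sys

# Child program that deliberately segfaults by reading from address 0.
CRASHING_CODE = "import ctypes; ctypes.string_at(0)"


def run_with_faulthandler(code):
    """Run Python code in a child process with faulthandler enabled."""
    return subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", code],
        capture_output=True,
        text=True,
    )


result = run_with_faulthandler(CRASHING_CODE)
if result.returncode == -signal.SIGSEGV:
    print("Child died from a segfault; faulthandler reported:")
print(result.stderr)
```

In your own programs you can get the same effect by calling faulthandler.enable() at startup, or by setting the PYTHONFAULTHANDLER environment variable.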

Death by assassination: The out-of-memory killer

On Linux and macOS, there is another way your process might die: the operating system can decide your process is using too much memory, and kill it preemptively. The symptom will be your program getting killed with SIGKILL (kill -9), with a corresponding exit code.
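If you launch the process from Python, you can detect this symptom from the return code: subprocess reports death-by-signal as a negative number, so SIGKILL shows up as -9 (a shell would report it as exit status 137, i.e. 128 + 9). Here is a minimal sketch, where the child simply kills itself to stand in for the out-of-memory killer, and describe_exit() is a hypothetical helper:

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Turn a subprocess return code into a human-readable description."""
    if returncode == -signal.SIGKILL:
        return "killed by SIGKILL (possibly the out-of-memory killer)"
    if returncode < 0:
        return f"killed by signal {signal.Signals(-returncode).name}"
    return f"exited with status {returncode}"


# Stand-in for a real OOM kill: the child sends SIGKILL to itself.
CHILD_CODE = "import os, signal; os.kill(os.getpid(), signal.SIGKILL)"

result = subprocess.run([sys.executable, "-c", CHILD_CODE])
print(describe_exit(result.returncode))
```

A SIGKILL on its own doesn't prove the out-of-memory killer was responsible, since anything with sufficient permissions can send it, so you'll still want to check the system logs.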

On Linux, you can see OOM killer logs in dmesg, and either /var/log/messages or /var/log/kern.log, depending on your distribution. Notifications are available via cgroups v1 or v2; which one applies depends on your distribution and its version (older releases default to v1, while many newer ones default to v2).
I am not sure where to find logs on macOS, but you can learn more about macOS out-of-memory handling and notifications here.

Debugging and preventing out-of-memory issues

Out-of-memory conditions can result in a variety of failure modes, from slowness to crashes, and the relevant information might end up in stderr, the application-level logging, system-level logging, or implicit in a core dump file. This makes debugging the cause of the problem rather tricky.

There are some ways to improve the situation, however.

For production server workloads, setting memory limits and limiting swap will at least keep memory leaks from freezing things up—eventually the leaky process will get killed and restarted.
For production batch data processing, you can model expected memory usage based on input size and then ensure you’re running with enough memory upfront.
If you’re running processes manually, you can use the open source Fil memory profiler to automatically catch and debug Python out-of-memory problems.
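As one illustration of setting a memory limit, here is a sketch that uses the standard library's resource module to cap a child process's address space, so an oversized allocation fails with MemoryError instead of pushing the machine into swap. This is only a stand-in for container or cgroup limits, and it makes some assumptions: Linux, and a limit on virtual address space (RLIMIT_AS) rather than on resident memory. The name limit_address_space() is hypothetical.

```python
import resource
import subprocess
import sys


def limit_address_space(max_bytes):
    """Return a function that caps the calling process's address space."""
    def apply_limit():
        _, hard = resource.getrlimit(resource.RLIMIT_AS)
        if hard == resource.RLIM_INFINITY:
            soft = max_bytes
        else:
            soft = min(max_bytes, hard)
        resource.setrlimit(resource.RLIMIT_AS, (soft, hard))
    return apply_limit


# Child program that asks for 2 GiB, more than the 1 GiB limit below.
CHILD_CODE = """
try:
    data = bytearray(2 * 1024**3)
except MemoryError:
    print("MemoryError")
else:
    print("allocated")
"""

result = subprocess.run(
    [sys.executable, "-c", CHILD_CODE],
    # Runs in the child just before it starts; the parent is unaffected.
    preexec_fn=limit_address_space(1024**3),
    capture_output=True,
    text=True,
)
print(result.stdout.strip())
```

Because the limit is applied only to the child, the parent process keeps its original limits.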
