Debugging Python out-of-memory crashes with the Fil profiler
You run your Python program, and it crashes: it's out of memory. And a crash is just one of several ways your program can fail when memory runs out.
How do you figure out what is using up all your program’s memory?
One way to do that is with the open source Fil memory profiler, which specifically supports debugging out-of-memory crashes. Let’s see how to use it.
Consider the following Python program:
    import numpy as np

    # Module-level list: every array appended here stays alive forever.
    ALLOCATIONS = []

    def add1(x):
        ALLOCATIONS.append(np.ones((1024 * 1024 * x)))

    def add2():
        add1(5)
        add1(2)

    def main():
        while True:
            add2()
            add1(3)
            x = np.ones((1024 * 1024,))

    main()
When I run this program the process is killed by the Linux out-of-memory killer. No traceback is printed.
    $ python oom.py
    Killed
Now, in this case the program is simple enough that you can figure out the memory leak from reading it, but real programs won’t be so easy. So what you want is a tool to help you debug the situation, a tool like the Fil memory profiler.
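To make the leak in the example concrete, here is a rough back-of-the-envelope calculation of how much memory each loop iteration retains. It is only a sketch: it assumes NumPy's default float64 dtype (8 bytes per element), ignores per-array overhead, and the 8 GiB limit is a hypothetical figure chosen for illustration.

```python
FLOAT64_BYTES = 8  # np.ones() defaults to float64, i.e. 8 bytes per element
MIB = 1024 * 1024


def array_mib(x):
    """Approximate size in MiB of np.ones(1024 * 1024 * x) with float64."""
    return 1024 * 1024 * x * FLOAT64_BYTES / MIB


# Each pass through the loop calls add1(5) and add1(2) via add2(), then add1(3).
# All three arrays are appended to ALLOCATIONS, so none of them is ever freed.
retained_per_iteration = sum(array_mib(x) for x in (5, 2, 3))
print(retained_per_iteration)  # 80.0


def iterations_until(limit_gib):
    """Roughly how many iterations fit in limit_gib of RAM (hypothetical limit)."""
    return int(limit_gib * 1024 // retained_per_iteration)


print(iterations_until(8))
```

The array assigned to `x` is not counted because it is not stored in `ALLOCATIONS`; it is released once `x` is rebound on the next iteration.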
Using the Fil memory profiler
To help you debug out-of-memory crashes, the Fil memory profiler can dump the current memory allocations at the time of a crash. In fact, it does its best to catch the problem early, before your computer slows to a crawl or the operating system kills your process.
Let’s see how you use Fil to debug this.
First, install Fil (Linux and macOS only at the moment) either with pip inside a virtualenv:
    $ pip install --upgrade pip
    $ pip install filprofiler
Or with Conda:
$ conda install -c conda-forge filprofiler
Make sure you’re using v0.14.1 or later, since that includes improved out-of-memory detection.
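One way to check the installed version from Python is sketched below. It assumes the distribution is registered under the name `filprofiler` (the name used in the install commands above). The version parsing is deliberately simplified and `is_new_enough` is a helper written for this example, not part of Fil.

```python
import re
from importlib import metadata

MINIMUM = (0, 14, 1)  # minimum version recommended above


def parse_version(text):
    """Turn a string like '0.14.1' into a tuple of ints (first three numbers only)."""
    return tuple(int(n) for n in re.findall(r"\d+", text)[:3])


def is_new_enough(text, minimum=MINIMUM):
    """True if the version string is at least the minimum version."""
    return parse_version(text) >= minimum


def installed_fil_version():
    """Return the installed filprofiler version string, or None if not installed."""
    try:
        return metadata.version("filprofiler")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_fil_version()
    if version is None:
        print("filprofiler is not installed in this environment")
    else:
        print(version, "OK" if is_new_enough(version) else "too old, please upgrade")
```

For anything beyond a quick check, a proper version parser such as the one in the `packaging` library is a safer choice than this regex.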
Next, we run the program under Fil. Fil detects the out-of-memory condition and writes out a report saying what memory allocations were made leading up to running out of memory:
    $ fil-profile run oom.py
    ...
    =fil-profile= Wrote memory usage flamegraph to fil-result/2020-06-15T12:37:13.033/out-of-memory.svg
    =fil-profile= Wrote memory usage flamegraph to fil-result/2020-06-15T12:37:13.033/out-of-memory-reversed.svg
out-of-memory.svg is a flame graph that shows exactly where all the memory came from at the time the process ran out of memory. That gives you a starting point for reducing that memory usage.
In addition, Fil will always exit with exit code 53 if you run out of memory, making it easy to identify out-of-memory issues if you’re running it in an automated fashion.
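In an automated setting, a small wrapper along these lines could act on that exit code and then locate the report. This is a sketch, not part of Fil: `run_command` and `newest_oom_report` are hypothetical helpers, and the directory layout (`fil-result/<timestamp>/out-of-memory.svg`) is assumed from the output shown above.

```python
import subprocess
import sys
from pathlib import Path

FIL_OOM_EXIT_CODE = 53  # exit code Fil uses when the program runs out of memory


def run_command(command):
    """Run a command; return (exit code, whether it matches Fil's OOM code)."""
    completed = subprocess.run(command)
    return completed.returncode, completed.returncode == FIL_OOM_EXIT_CODE


def newest_oom_report(result_dir="fil-result"):
    """Return the most recent out-of-memory.svg under result_dir, or None.

    Assumes one timestamp-named subdirectory per run, so sorting the paths
    alphabetically also sorts them chronologically.
    """
    reports = sorted(Path(result_dir).glob("*/out-of-memory.svg"))
    return reports[-1] if reports else None


def report_outcome(command):
    """Run the command and print a short summary of what happened."""
    code, out_of_memory = run_command(command)
    if out_of_memory:
        print("Out of memory; report:", newest_oom_report(), file=sys.stderr)
    else:
        print("Exited with code", code)
    return code
```

Calling `report_outcome(["fil-profile", "run", "oom.py"])` from a CI job would then distinguish an out-of-memory failure from any other non-zero exit.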
Memory use too high? Try Fil
Fil can help you figure out where your crashing program is allocating its memory. But it can also help with programs that don't crash, by measuring the peak memory usage of your data processing program.
Once you’ve measured memory use and know where it’s coming from, you can start applying a variety of techniques to reduce memory usage.