Debugging Python out-of-memory crashes with the Fil profiler
You run your Python program, and it crashes—it’s out of memory. And this is just one of the multiple ways your program can fail in out-of-memory situations.
How do you figure out what is using up all your program’s memory?
One way to do that is with the open source Fil memory profiler, which specifically supports debugging out-of-memory crashes. Let’s see how to use it.
Consider the following Python program:
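import numpy as np

# Module-level list: everything appended here stays alive forever.
ALLOCATIONS = []

def add1(x):
    # Allocate an array of 1024 * 1024 * x floats and keep a reference to it.
    ALLOCATIONS.append(np.ones((1024 * 1024 * x)))

def add2():
    add1(5)
    add1(2)

def main():
    while True:
        add2()
        add1(3)
        # Temporary array; it is not stored in ALLOCATIONS.
        x = np.ones((1024 * 1024,))

main()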
When I run this program the process is killed by the Linux out-of-memory killer. No traceback is printed.
$ python oom.py
Killed
Now, in this case the program is simple enough that you can figure out the memory leak from reading it, but real programs won’t be so easy. So what you want is a tool to help you debug the situation, a tool like the Fil memory profiler.
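To make the leak in this toy program concrete, here is a rough back-of-the-envelope sketch of how much memory each pass through the loop in main() retains. It assumes np.ones() uses its default float64 dtype (8 bytes per element) and ignores per-array overhead. The helper names and the 8 GiB figure are made up for illustration.

```python
# Back-of-the-envelope estimate of the leak in oom.py.
# Assumption: np.ones() defaults to float64, i.e. 8 bytes per element.
BYTES_PER_ELEMENT = 8
MIB = 1024 * 1024


def add1_bytes(x):
    """Bytes retained by add1(x), which appends np.ones(1024 * 1024 * x)."""
    return 1024 * 1024 * x * BYTES_PER_ELEMENT


def bytes_per_iteration():
    """Bytes retained by one pass of the loop in main().

    The local array x in main() is not counted: it is never stored in
    ALLOCATIONS, so it can be freed when x is rebound on the next pass.
    """
    add2 = add1_bytes(5) + add1_bytes(2)  # add2() calls add1(5) and add1(2)
    return add2 + add1_bytes(3)           # main() then calls add1(3)


leaked = bytes_per_iteration()
print(f"{leaked // MIB} MiB retained per iteration")

# Hypothetical machine with 8 GiB of RAM: how many full iterations fit?
print(f"{(8 * 1024 * MIB) // leaked} full iterations fit in 8 GiB")
```

Under these assumptions the list grows steadily on every iteration and nothing is ever released, which is why the process eventually gets killed.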
Using the Fil memory profiler
To help you debug out-of-memory crashes, the Fil memory profiler can dump the current memory allocations at the time of a crash. In fact, it will do its best to catch the problem early, before your computer massively slows down or your process is killed by the operating system.
Let’s see how you use Fil to debug this.
First, install Fil (Linux and macOS only at the moment) either with pip inside a virtualenv:
$ pip install --upgrade pip
$ pip install filprofiler
Or with Conda:
$ conda install -c conda-forge filprofiler
Make sure you’re using v0.14.1 or later, since that includes improved out-of-memory detection.
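One way to check the installed version from Python is sketched below. It assumes the distribution is registered under the name filprofiler (the name used in the pip command) and that its metadata is visible to importlib.metadata, which may not hold for every install method. The version parsing is deliberately simplistic and the helper names are illustrative.

```python
import re
from importlib.metadata import PackageNotFoundError, version

MINIMUM = (0, 14, 1)  # minimum version recommended in the text


def parse_version(text):
    """Turn '0.14.1' into (0, 14, 1), keeping only leading digits per part."""
    numbers = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first part with no leading number
        numbers.append(int(match.group()))
    return tuple(numbers)


def fil_is_recent_enough():
    """Return True if filprofiler is installed and at least MINIMUM."""
    try:
        installed = version("filprofiler")  # assumed distribution name
    except PackageNotFoundError:
        return False
    return parse_version(installed) >= MINIMUM


print("Fil is recent enough:", fil_is_recent_enough())
```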
Next, we run the program under Fil. Fil detects the out-of-memory condition and writes out a report saying what memory allocations were made leading up to running out of memory:
$ fil-profile run oom.py
...
=fil-profile= Wrote memory usage flamegraph to fil-result/2020-06-15T12:37:13.033/out-of-memory.svg
=fil-profile= Wrote memory usage flamegraph to fil-result/2020-06-15T12:37:13.033/out-of-memory-reversed.svg
out-of-memory.svg is a flame graph that shows exactly where all the memory came from at the time the process ran out of memory. That gives you a starting point for reducing that memory usage.
In addition, Fil will always exit with exit code 53 if you run out of memory, making it easy to identify out-of-memory issues if you’re running it in an automated fashion.
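In an automated setting, a wrapper script could use that exit code to tell out-of-memory failures apart from other failures. The following is a minimal sketch, assuming fil-profile is on the PATH. The function names are hypothetical, and the SIGKILL branch is only a guess at the cause, since other things can also send SIGKILL.

```python
import signal
import subprocess

FIL_OOM_EXIT_CODE = 53  # exit code Fil uses for out-of-memory, per the text


def classify(returncode):
    """Map a process return code to a short human-readable description."""
    if returncode == 0:
        return "ok"
    if returncode == FIL_OOM_EXIT_CODE:
        return "out-of-memory (caught by Fil)"
    # subprocess reports death by signal N as a return code of -N.
    if returncode == -signal.SIGKILL:
        return "killed by SIGKILL (possibly the OOM killer)"
    return f"failed with exit code {returncode}"


def run_under_fil(script):
    """Run a script under Fil and describe how it ended."""
    result = subprocess.run(["fil-profile", "run", script])
    return classify(result.returncode)
```

Calling run_under_fil("oom.py") would then return the out-of-memory description whenever Fil catches the problem and exits with code 53.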
Memory use too high? Try Fil
Fil can help you figure out where your crashing program is allocating its memory. But it can also help you with non-crashing programs, by measuring peak usage of your data processing program.
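Purely to illustrate what "peak usage" means, the sketch below uses the standard library's tracemalloc module. This is a different tool from Fil and it only sees allocations made through Python's own allocator. The process() workload is hypothetical.

```python
import tracemalloc

MIB = 1024 * 1024


def process():
    """Hypothetical workload: a large temporary buffer, then a small result."""
    temporary = bytearray(8 * MIB)  # this buffer dominates peak usage
    result = len(temporary)
    del temporary                   # freed before the function returns
    return result


tracemalloc.start()
process()
# current: bytes still allocated now; peak: highest value seen since start().
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

print(f"current: {current / MIB:.1f} MiB, peak: {peak / MIB:.1f} MiB")
```

The peak is what determines whether the program fits in memory, even though almost nothing is still allocated by the time the function returns.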
Once you’ve measured memory use and know where it’s coming from, you can start applying a variety of techniques to reduce memory usage.
Learn even more techniques for reducing memory usage: read the rest of the Larger-than-memory datasets guide for Python.