Clinging to memory: how Python function calls can increase memory use

by Itamar Turner-Trauring
Last updated 06 Jan 2023, originally created 23 Jun 2020

Unlike languages like C, much of the time Python will free up memory for you. But sometimes, it won’t work the way you expect it to.

Consider the following Python program—how much memory do you think it will use at peak?

import numpy as np

def load_1GB_of_data():
    return np.ones((2 ** 30), dtype=np.uint8)

def process_data():
    data = load_1GB_of_data()
    return modify2(modify1(data))

def modify1(data):
    return data * 2

def modify2(data):
    return data + 10

process_data()

Presuming we can’t mutate the original data, the best we can do is peak memory of 2GB: for a brief moment in time both the original 1GB of data and the modified copy of the data will need to be present. In practice, actual peak usage will be 3GB—lower down you’ll see an actual memory profiling result demonstrating that.

The best we can do is 2GB, actual use is 3GB: where did that extra 1GB of memory usage come from? The interaction of function calls with Python’s memory management.

To understand why, and what you can do to fix it, this article will cover:

A quick overview of how Python automatically manages memory for you.
How functions impact Python’s memory tracking.
What you can do to fix this problem.

How Python’s automatic memory management makes your life easier

In some programming languages you need to explicitly deallocate any memory you allocated. A C program, for example, might do:

uint8_t *arr = malloc(1024 * 1024);
// ... do work with array ...
free(arr);

If you don’t manually free() the memory allocated by malloc(), it will never get freed.

Python, in contrast, tracks objects and frees their memory automatically when they’re no longer used. But sometimes that fails, and to understand why you need to understand how it tracks them.

To a first approximation, the default Python implementation does this using reference counting:

Each object has a counter of the number of places it’s being used.
When a new place/object gets a reference to the object, the counter is incremented by 1.
When a reference goes away, the counter is decremented by 1.
When the counter hits 0, the object’s memory is freed, since no one refers to it.

There are some additional mechanisms (“garbage collection”) to deal with circular references, but those aren’t relevant to the topic at hand.
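
The counting rules above can be observed directly. The following is a minimal sketch, assuming the default CPython interpreter. `Blob` is a hypothetical placeholder class. `sys.getrefcount()` includes temporary references of its own in the number it reports, so only the differences between readings are meaningful.

```python
import sys
import weakref


class Blob:
    """Hypothetical placeholder; any ordinary object would do."""


blob = Blob()
baseline = sys.getrefcount(blob)

# Storing the object in a list adds one reference.
holder = [blob]
after_add = sys.getrefcount(blob)

# Removing it from the list takes that reference away again.
holder.clear()
after_remove = sys.getrefcount(blob)

# A weak reference does not increment the counter, so it can be used
# to check whether the object has been freed.
probe = weakref.ref(blob)
del blob  # the last reference goes away, so the counter hits 0
freed = probe() is None

print(after_add - baseline, after_remove - baseline, freed)
```

On CPython this should print `1 0 True`: one extra reference while the list holds the object, none after the list is cleared, and the object is gone once the last name is deleted.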

How functions interact with Python memory management

One way you can add a reference to an object is by adding it to another object: a list, a dictionary, an attribute of a class instance, and so on. But references are also created by local variables in functions.

Let’s look at an example:

def f():
    obj = {"x": 1}
    g(obj)
    return
    
def g(o):
    print(o)
    return

Let’s say we call f(), and go through the code step by step:

f():
    obj = {"x": 1}  # `obj` increments counter to 1
    g(o=obj):
       # `o` reference increments counter to 2
       print(o)
       return  # `o` goes away, decrements counter to 1
    return  # `obj` goes away, decrements counter to 0
# Dictionary is freed from memory

In prose form:

We do obj = {"x": 1}, which means there is a local variable obj pointing to the dictionary we created. That variable, created by running the function, increments the object’s reference counter.
Next we pass that object to g. There is now a local variable called o that is an additional reference to the same dictionary, so the total reference count is 2.
Next we print o, which may or may not add a reference, but once print() returns we have no additional references, and we’re still at 2.
g() returns, which means the local o variable goes away, decrementing the reference count to 1.
Finally, f() returns, the local obj variable goes away, decrementing the reference count again to 0.
The reference count is now 0, and the dictionary can be freed. This also decrements the reference count for the "x" string and the 1 integer we created, modulo some string- and integer-specific optimizations I won’t go into.

Now let’s look at that code again, on a semantic level. Once the dictionary is passed to g(), it will never be used by f() again—and yet, there is still a reference from f() due to the obj variable, which is why the reference count is 2. The local variable’s reference will never go away until f() exits, even though f() is done using it.
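
To see this lifetime in action, here is a small sketch, assuming CPython. Plain dictionaries can't be weakly referenced, so a hypothetical `Payload` class stands in for the dictionary, and `weakref.finalize()` records the moment the object is actually freed. The `events` list is also just an illustration device.

```python
import weakref

events = []


class Payload:
    """Hypothetical stand-in for the dictionary in the example above."""


def g(o):
    events.append("g finished")


def f():
    obj = Payload()
    # Record the moment the object is actually freed.
    weakref.finalize(obj, events.append, "payload freed")
    g(obj)
    # f() never uses obj again, but the local variable keeps it alive.
    events.append("f still running")


f()
events.append("f returned")
print(events)
```

On CPython the recorded order should be `g finished`, `f still running`, `payload freed`, `f returned`: the object outlives its last real use and is only freed as f() exits.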

Now, keeping a small dictionary in memory for slightly longer isn’t really a problem. But what if that object used a lot of memory?

The extra 1GB

Let’s return to our original code, where we had an unexpected extra 1GB of memory usage. To recap:

# ...

def process_data():
    data = load_1GB_of_data()
    return modify2(modify1(data))

def modify1(data):
    return data * 2

def modify2(data):
    return data + 10

If we profile it with the Fil memory profiler to get the allocations at the time of peak memory usage, we see that at peak we use 3GB due to three allocations. In effect, we’re looking at the moment in time when modify2() allocates its modified array:

The original array created by load_1GB_of_data().
The first modified array, created by modify1(); this hangs around until modify2() has finished using it and then gets deallocated.
The second modified array, created by modify2().

The problem is that first allocation: we don’t need it any more once modify1() has created the modified version. But because of the local variable data in process_data(), it is not freed from memory until process_data() returns. And that means memory usage is 1GB higher than it would otherwise be.
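
If you don't have a profiler at hand, the same effect can be reproduced at a smaller scale with the standard library. This is only a sketch under several assumptions: it uses `tracemalloc` rather than Fil, a 10MB `bytearray` stands in for the 1GB NumPy array, and `bytearray.translate()` with lookup tables stands in for the arithmetic. The names `SIZE`, `DOUBLE`, `PLUS_TEN` and `load_data` are made up for the illustration.

```python
import tracemalloc

SIZE = 10 * 1024 * 1024  # assumption: 10MB stands in for the 1GB array

# Byte lookup tables, so that translate() returns a modified copy.
DOUBLE = bytes((i * 2) % 256 for i in range(256))
PLUS_TEN = bytes((i + 10) % 256 for i in range(256))


def load_data():
    return bytearray(b"\x01") * SIZE


def modify1(data):
    return data.translate(DOUBLE)


def modify2(data):
    return data.translate(PLUS_TEN)


def process_data():
    data = load_data()  # this local keeps the original alive
    return modify2(modify1(data))


tracemalloc.start()
result = process_data()
_, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

peak_in_copies = peak / SIZE
print(f"peak is about {peak_in_copies:.1f}x the size of the data")
```

On CPython this should report a peak of roughly 3x the data size, matching the three allocations listed above.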

Note: Whether or not any particular tool or technique will help depends on where the actual memory bottlenecks are in your software.

Solutions: Making functions let go

Our problem is that process_data() is holding on to the original array for too long:

def process_data():
    data = load_1GB_of_data() # ← `data` var lives too long
    return modify2(modify1(data))

Solutions therefore involve making sure the data local variable doesn’t hold on to the original array for longer than it needs to.

Solution #1: No local variable at all

If there’s no extra reference, the original array can be removed from memory as soon as it’s not used:

# ...

def process_data():
    return modify2(modify1(load_1GB_of_data()))
    
# ...

Now there is no data reference keeping the original 1GB of data alive, and peak memory usage will be 2GB.

Solution #2: Re-use the local variable

We can explicitly replace data with the result of modify1():

# ...

def process_data():
    data = load_1GB_of_data()
    data = modify1(data)
    data = modify2(data)
    return data
    
# ...

Again, we end up with 2GB peak memory, since the original array can be deallocated as soon as modify1() finishes.
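
Both Solution #1 and Solution #2 can be checked at a smaller scale. As a rough sketch, again assuming CPython, with `tracemalloc` in place of a real profiler and a 10MB `bytearray` in place of the NumPy array (the helper `peak_copies` and the other names are hypothetical):

```python
import tracemalloc

SIZE = 10 * 1024 * 1024  # assumption: 10MB stands in for the 1GB array
DOUBLE = bytes((i * 2) % 256 for i in range(256))
PLUS_TEN = bytes((i + 10) % 256 for i in range(256))


def load_data():
    return bytearray(b"\x01") * SIZE


def modify1(data):
    return data.translate(DOUBLE)


def modify2(data):
    return data.translate(PLUS_TEN)


def without_local():
    # Solution #1: no local variable at all.
    return modify2(modify1(load_data()))


def reused_local():
    # Solution #2: each assignment drops the previous array.
    data = load_data()
    data = modify1(data)
    data = modify2(data)
    return data


def peak_copies(func):
    """Return peak traced memory while running func, in units of SIZE."""
    tracemalloc.start()
    result = func()
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak / SIZE


peak_without_local = peak_copies(without_local)
peak_reused_local = peak_copies(reused_local)
print(f"{peak_without_local:.1f}x {peak_reused_local:.1f}x")
```

Both variants should peak at roughly 2x the data size, since at most two of the arrays are alive at the same time.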

Solution #3: Transfer object ownership

This is a trick borrowed from C++: we have an object whose job it is to own the large 1GB chunk of data, and we pass the owner instead of the original object.

# ...

class Owner:
    def __init__(self, data):
        self.data = data
        
def process_data():
    data = Owner(load_1GB_of_data())
    return modify2(modify1(data))
    
def modify1(owned_data):
    data = owned_data.data
    # Remove a reference to original data:
    owned_data.data = None
    return data * 2

# ...

The trick is that process_data() no longer has a reference to the large chunk of data, but rather to the owner—and modify1 then clears/resets the owner once it’s extracted the data it needs.
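
For completeness, here is a runnable sketch of the ownership pattern at a smaller scale. It rests on the same assumptions as before: CPython, `tracemalloc` instead of a real profiler, and a 10MB `bytearray` with `translate()` standing in for the NumPy array and its arithmetic. Only `Owner`, `process_data`, `modify1` and `modify2` come from the code above; the other names are hypothetical.

```python
import tracemalloc

SIZE = 10 * 1024 * 1024  # assumption: 10MB stands in for the 1GB array
DOUBLE = bytes((i * 2) % 256 for i in range(256))
PLUS_TEN = bytes((i + 10) % 256 for i in range(256))


class Owner:
    def __init__(self, data):
        self.data = data


def load_data():
    return bytearray(b"\x01") * SIZE


def modify1(owned_data):
    data = owned_data.data
    # Remove the owner's reference; only this local variable remains.
    owned_data.data = None
    return data.translate(DOUBLE)


def modify2(data):
    return data.translate(PLUS_TEN)


def process_data():
    data = Owner(load_data())
    return modify2(modify1(data))


tracemalloc.start()
result = process_data()
_, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()
peak_in_copies = peak / SIZE

# Separately, confirm that modify1() leaves the owner empty.
owner = Owner(load_data())
modify1(owner)
owner_emptied = owner.data is None

print(f"peak is about {peak_in_copies:.1f}x, owner emptied: {owner_emptied}")
```

The peak should again be roughly 2x the data size. The cost of this design is that modify1() now has a side effect on its argument, so callers can no longer use the owner's data afterwards.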

Tracking object references

In normal code, having objects live a little longer doesn’t matter. But when an object uses multiple gigabytes of RAM, living too long can either make your program run out of memory, or require paying for more hardware.

So get in the habit of mentally tracking where the references to objects are. And if memory usage is too high, and the profiler suggests function-level references are the problem, try one of the techniques above.

Learn even more techniques for reducing memory usage—read the rest of the Larger-than-memory datasets guide for Python.
