When C extensions crash: easier debugging for your test suite

It’s common to use C extensions in Python applications, either to access pre-existing libraries or for performance reasons. But unlike Python, C and C++ lack memory safety, which can lead to crashes, and when that happens you’ll need to figure out what caused the crash.

This is extra fun when you get a silent crash half-way through a test run on your CI system:

  • You typically don’t have access to a core file.
  • Lacking good output, you might not even know which test caused the crash.

In this article I’ll cover some ways you can prepare for crashes in advance, so when they do occur you can quickly figure out which part of the codebase caused them:

  1. The standard library’s faulthandler.
  2. Verbose test runs.
  3. Package listing.
  4. catchsegv on Linux.

1. Tracebacks on segfaults with faulthandler

The Python standard library has a handy module called faulthandler that can print a traceback when a segfault occurs—that is, when a C extension crashes (the documentation has a nice example).

All you need to do is call faulthandler.enable() in your code, which by default outputs tracebacks to sys.stderr.

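If you’re using py.test to run your tests you can alternatively just install the pytest-faulthandler package: it will enable faulthandler automatically when you use py.test to run tests. (As of pytest 5.0 this functionality is built into pytest itself and enabled by default, so the separate package is only needed on older versions.)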

The only caveat is that if the problem involves sufficiently bad memory corruption, faulthandler may not be able to produce any useful output.
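As a minimal sketch of the faulthandler.enable() call described above: the helper name enable_crash_tracebacks is my own invention for illustration, not part of the standard library. The one thing to watch is that faulthandler writes straight to a file descriptor, so the stream you pass has to be a real file rather than an in-memory buffer.

```python
import faulthandler
import sys


def enable_crash_tracebacks(stream=None):
    """Enable faulthandler so fatal signals dump a Python traceback.

    `stream` must be a real file with a file descriptor (fileno());
    it defaults to sys.stderr, as faulthandler.enable() itself does.
    (This helper is hypothetical; only the faulthandler calls are stdlib.)
    """
    if stream is None:
        stream = sys.stderr
    # all_threads=True (the default) dumps the traceback of every thread,
    # which helps when the crash happens off the main thread.
    faulthandler.enable(file=stream, all_threads=True)
    return faulthandler.is_enabled()


if __name__ == "__main__":
    enable_crash_tracebacks()
    # ... import and run the code that uses C extensions here ...
```

If you’d rather not touch the code at all, running the interpreter with python -X faulthandler or setting the PYTHONFAULTHANDLER=1 environment variable should have the same effect.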

2. Enable detailed reporting of which test is running

Many test runners don’t print which tests are being run by default: you just get a list of dots:

$ py.test
test_precalculate.py ......          [100%]

The problem is that if you crash, and the only thing you have access to is that output, you won’t know exactly where the crash happened: you’ll know the test module, but what if your module has 100 tests, or these are integration tests that can call lots of different codepaths?

So on CI at least, make sure you run tests with more detailed reporting, so you know exactly which tests ran. E.g. add the -v flag for py.test:

$ py.test -v
test_precalculate.py::test_created_in_threadpool PASSED
test_precalculate.py::test_destroyed_in_threadpool PASSED
test_precalculate.py::test_precreated PASSED
test_precalculate.py::test_new_create_on_get PASSED

If you crash you will then be able to see which test caused the problem—the last one printed, typically.
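The same idea applies if you use the standard library’s unittest runner instead of py.test. Here is a rough sketch, assuming a made-up ExampleTests class standing in for your real tests: with verbosity=2 the runner prints one line per test rather than a dot, and it writes the test’s name before the test body runs, so a crash should leave the culprit as the last line of output.

```python
import sys
import unittest


class ExampleTests(unittest.TestCase):
    """Hypothetical stand-in for tests that exercise a C extension."""

    def test_addition(self):
        self.assertEqual(1 + 1, 2)


def run_verbose(stream=None):
    """Run ExampleTests, printing each test's name as it runs."""
    if stream is None:
        stream = sys.stderr
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTests)
    # verbosity=2 prints one line per test instead of a single dot.
    runner = unittest.TextTestRunner(stream=stream, verbosity=2)
    return runner.run(suite)


if __name__ == "__main__":
    run_verbose()
```

From the command line, python -m unittest -v gives the same per-test output without any code changes.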

3. Dump the installed packages at the start of the CI run

If the crash is in a library, sometimes you’ll start getting crashes because of a minor change in the library version. If your local development machine has different package versions you won’t be able to reproduce the problem.

So unless you’re explicitly pinning specific package builds (with hashes for pip, or via conda env export for Conda), make sure to print out the packages you’ve installed at the start of each CI run.

That is, before you run your tests, run pip list (or conda env export if you use Conda) so you know exactly which packages were used in the CI run.
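If it’s more convenient to do this from inside the test process (say, from a conftest.py), something along these lines should give you roughly what pip list does. This is only a sketch using the standard library’s importlib.metadata; the function names installed_packages and dump_environment are hypothetical, and it only sees Python distributions, not Conda’s non-Python packages.

```python
import sys
from importlib import metadata


def installed_packages():
    """Return sorted 'name==version' strings for installed distributions."""
    lines = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions with broken or missing metadata
            lines.add(f"{name}=={dist.version}")
    return sorted(lines, key=str.lower)


def dump_environment(stream=None):
    """Print the interpreter version and package list, e.g. at CI start."""
    if stream is None:
        stream = sys.stderr
    print(f"Python {sys.version}", file=stream)
    for line in installed_packages():
        print(line, file=stream)
    stream.flush()


if __name__ == "__main__":
    dump_environment()
```

Printing the interpreter version alongside the packages is my own addition: binary wheels are built per Python version, so it seemed worth recording too.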

4. Use catchsegv on Linux

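catchsegv is a Linux utility, shipped with glibc, that prints helpful information when your program segfaults: a register dump, a backtrace, and the process’s memory map. It shouldn’t add much overhead, so just change your test command to run like this (note that catchsegv was removed in glibc 2.35, so recent distributions may no longer include it):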

$ catchsegv py.test

(Thanks to Glyph Lefkowitz for the suggestion.)
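Where catchsegv isn’t installed, a small wrapper can at least make a silent crash less silent. To be clear, this is not a replacement for catchsegv (it prints no backtrace or register dump), just a sketch of a complementary check: on POSIX systems, Python’s subprocess module reports a negative return code when the child was killed by a signal, so the wrapper can say which one. The names describe_exit and run_and_report are hypothetical.

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Turn a subprocess return code into a human-readable message."""
    if returncode < 0:
        # On POSIX, a negative return code means "killed by signal -returncode".
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = f"signal {-returncode}"
        return f"killed by {name}"
    return f"exited with status {returncode}"


def run_and_report(command):
    """Run `command` (a list of arguments) and report how it ended."""
    completed = subprocess.run(command)
    message = describe_exit(completed.returncode)
    print(f"{command[0]}: {message}", file=sys.stderr, flush=True)
    return completed.returncode


if __name__ == "__main__" and len(sys.argv) > 1:
    # Assumed usage: python run_and_report.py py.test -v
    sys.exit(0 if run_and_report(sys.argv[1:]) == 0 else 1)
```

A segfault in the test run would then show up in the CI log as "killed by SIGSEGV" instead of the output simply stopping.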

Failure is inevitable

Sooner or later something will go wrong—and with just a smidgen of bad luck it will happen in a way that makes it very hard to figure out what exactly crashed.

So don’t wait for crashes to occur before adding this debug output—do it today, and your future self (or coworkers) will thank you.
