Building and testing a hybrid Python/C++ package

UPDATE (2/2/2018): I now think there are better ways to structure a Python package that contains both Python and C++ source code. Please see my follow-up post. I’m leaving this post here for posterity.

For my research, I’ve spent the better part of the last year developing a simulation tool in Python. Python abstracts away things like memory management and type information, making it a great language for working through high-level design decisions. But for simulation software, pure Python is slow. So I’ve taken to a workflow of prototyping in Python and then rewriting portions of the code base into C++ for performance. I’m left with a high-performance Python package that mixes both Python modules and compiled C++-based extension modules. This combination leverages both the simplicity of Python and the efficiency of C++.

Interfacing C++ code with Python has become relatively easy thanks to several libraries. My favorite is pybind11, a header-only C++ library which takes inspiration from Boost.Python but vastly simplifies the syntax. Pybind11 allows you to write Python wrappers (also called bindings) for C++ code and generate a Python extension module with minimal boilerplate code. Going from a module to a proper Python package, however, takes some more work. In my mind, a Python package based on a hybrid Python/C++ code base should include:

The aim of this blog post is to describe how to set up a Python package that meets all of the above requirements with minimal external tools or libraries.

The first part of this post closely follows the pybind11 introductory tutorial and the pybind11/CMake example repository. This post is not meant as an introduction to pybind11. If you’ve never used pybind11 before, it’s worth taking the time now to read through some of the excellent documentation. For Part 1 of this post, I’ll assume a general familiarity with pybind11.

In the second part of this post, I’ll introduce unit testing frameworks for both the Python side and C++ side of our package’s code base. For that, we’ll use Python’s built-in unittest module and the catch C++ library. Lastly, I’ll demonstrate how to stitch these testing frameworks together with Python’s setuptools and setup.py.

All the code presented here has been tested with Python 3.6 and a compiler that supports C++11 on macOS Sierra. C++11 support is required by pybind11, but the Python code and extension modules generated here should work with either Python 2 or 3. The code should also work on Windows, although I have not tested it. All code in this blog post is available as a complete working example in this GitHub repository.

Part 1: A simple pybind11 project

The C++ code

To start, we’ll create a simple pybind11-based Python module. The module will share the same name as our package, python_cpp_example. The directory structure for our package should be familiar to those who have written Python packages before, with a few exceptions, notably lib, build, and CMakeLists.txt (which we’ll create later in this post):

python_cpp_example/
├── build  # build directory for C++ executables
├── lib  # external C++ libraries
├── python_cpp_example  # source code (Python and C++)
├── setup.py
└── tests  # unit tests (Python and C++)

The build/ directory will contain compiled code generated by our build system. The lib/ directory will contain the C++ libraries needed for our package. Download and extract a pybind11 release (v2.1.1 at the time of writing) and copy it into lib/ by running the following commands from the directory that contains python_cpp_example/ (assuming you’re working in a *nix environment):

wget https://github.com/pybind/pybind11/archive/v2.1.1.tar.gz
tar -xvf v2.1.1.tar.gz
# Copy pybind11 library into our project
cp -r pybind11-2.1.1 python_cpp_example/lib/pybind11

Now we’ll write two simple C++ functions and then wrap them in Python with pybind11. Under python_cpp_example, create three files: math.hpp, math.cpp, and bindings.cpp.

math.hpp and math.cpp are simple C++ header and definition files:

math.hpp:

/*! Add two integers
    \param i an integer
    \param j another integer
*/
int add(int i, int j);
/*! Subtract one integer from another 
    \param i an integer
    \param j an integer to subtract from \p i
*/
int subtract(int i, int j);

math.cpp:

#include "math.hpp"

int add(int i, int j)
{
    return i + j;
}

int subtract(int i, int j)
{
    return i - j;
}

Lastly, we’ll define our Python wrappers in a file called bindings.cpp. See the pybind11 tutorial for a detailed explanation of what each line of code does here.

bindings.cpp:

#include <pybind11/pybind11.h>
#include "math.hpp"

namespace py = pybind11;

PYBIND11_PLUGIN(python_cpp_example)
{
    py::module m("python_cpp_example");
    m.def("add", &add);
    m.def("subtract", &subtract);
    return m.ptr();
}

We’ve now defined two functions in C++, add and subtract, and written the code to wrap them in a Python module called python_cpp_example. Your directory structure should now look something like this:

python_cpp_example/
├── build
├── lib
│   └── pybind11
├── python_cpp_example
│   ├── bindings.cpp
│   ├── math.cpp
│   └── math.hpp
├── setup.py
└── tests

Next we have to set up our build environment. We’ll use CMake to build the C++ extension modules in our package.

Configuring the build environment

We could have written a Makefile directly to build our package, but using CMake simplifies building pybind11-based modules. (If you don’t believe me, check out the Makefile that CMake generates.) Using CMake will also make it easier to add C++ unit tests later.

First make sure that you have CMake installed. On macOS, installation can be done with brew by running the following command:

> brew install cmake

If you are not familiar with CMake, I suggest skimming through the CMake introductory tutorial. We’re going to use a small set of CMake functions here, so even if you’re new to CMake, the code should be easy to follow.

To start, we’ll define a project, set the source directory, and define a list of C++ sources without bindings.cpp. (This list will come in handy later when we want to build C++ tests independently of any Python bindings.) Create a CMakeLists.txt file in the package’s root directory and add the following:

cmake_minimum_required(VERSION 2.8.12)
project(python_cpp_example)
# Set source directory
set(SOURCE_DIR "python_cpp_example")
# Tell CMake that headers are also in SOURCE_DIR
include_directories(${SOURCE_DIR})
set(SOURCES "${SOURCE_DIR}/math.cpp")

Next, we’ll tell CMake to add the pybind11 directory to our project and define an extension module. This time, make sure bindings.cpp is added to the sources list. Add the following to CMakeLists.txt:

# Generate Python module
add_subdirectory(lib/pybind11)
pybind11_add_module(python_cpp_example ${SOURCES} "${SOURCE_DIR}/bindings.cpp")

That’s all we need to instruct CMake to build our extension module. Rather than run CMake directly, however, we’re going to configure Python’s setuptools to build our package automatically via setup.py.

Building with setuptools

On its own, setup.py will not build an extension module with a CMake-based build system. We have to define a custom build command. The code I’m presenting here was largely taken from pybind11’s CMake example repository. I won’t explain every line of the code here, but in brief, we’re defining two classes: CMakeExtension, which records where an extension’s CMake project lives, and CMakeBuild, which creates a temporary build directory and then calls CMake to build any extension modules in our package. Add these two class definitions to setup.py:

import os
import re
import sys
import sysconfig
import platform
import subprocess

from distutils.version import LooseVersion
from setuptools import setup, Extension
from setuptools.command.build_ext import build_ext


class CMakeExtension(Extension):
    def __init__(self, name, sourcedir=''):
        Extension.__init__(self, name, sources=[])
        self.sourcedir = os.path.abspath(sourcedir)


class CMakeBuild(build_ext):
    def run(self):
        try:
            out = subprocess.check_output(['cmake', '--version'])
        except OSError:
            raise RuntimeError(
                "CMake must be installed to build the following extensions: " +
                ", ".join(e.name for e in self.extensions))

        if platform.system() == "Windows":
            cmake_version = LooseVersion(re.search(r'version\s*([\d.]+)',
                                         out.decode()).group(1))
            if cmake_version < '3.1.0':
                raise RuntimeError("CMake >= 3.1.0 is required on Windows")

        for ext in self.extensions:
            self.build_extension(ext)

    def build_extension(self, ext):
        extdir = os.path.abspath(
            os.path.dirname(self.get_ext_fullpath(ext.name)))
        cmake_args = ['-DCMAKE_LIBRARY_OUTPUT_DIRECTORY=' + extdir,
                      '-DPYTHON_EXECUTABLE=' + sys.executable]

        cfg = 'Debug' if self.debug else 'Release'
        build_args = ['--config', cfg]

        if platform.system() == "Windows":
            cmake_args += ['-DCMAKE_LIBRARY_OUTPUT_DIRECTORY_{}={}'.format(
                cfg.upper(),
                extdir)]
            if sys.maxsize > 2**32:
                cmake_args += ['-A', 'x64']
            build_args += ['--', '/m']
        else:
            cmake_args += ['-DCMAKE_BUILD_TYPE=' + cfg]
            build_args += ['--', '-j2']

        env = os.environ.copy()
        env['CXXFLAGS'] = '{} -DVERSION_INFO=\\"{}\\"'.format(
            env.get('CXXFLAGS', ''),
            self.distribution.get_version())
        if not os.path.exists(self.build_temp):
            os.makedirs(self.build_temp)
        subprocess.check_call(['cmake', ext.sourcedir] + cmake_args,
                              cwd=self.build_temp, env=env)
        subprocess.check_call(['cmake', '--build', '.'] + build_args,
                              cwd=self.build_temp)
        print()  # Add an empty line for cleaner output

Next, at the bottom of setup.py, modify setup() with the newly-defined custom extension builder:

setup(
    name='python_cpp_example',
    version='0.1',
    author='Benjamin Jack',
    author_email='benjamin.r.jack@gmail.com',
    description='A hybrid Python/C++ test project',
    long_description='',
    # add extension module
    ext_modules=[CMakeExtension('python_cpp_example')],
    # add custom build_ext command
    cmdclass=dict(build_ext=CMakeBuild),
    zip_safe=False,
)

Now you should be able to run python3 setup.py develop from within your package’s root directory and you will see an extension module generated. You can now import and use your new package.
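
As a quick check that the build worked, something like the following sketch could be run from the package root. smoke_check is a hypothetical helper name of my own; it returns None when the extension module cannot be found rather than raising an ImportError.

```python
import importlib.util


def smoke_check(module_name="python_cpp_example"):
    """Call add and subtract from the built extension, if it is importable."""
    if importlib.util.find_spec(module_name) is None:
        return None  # module has not been built or installed yet
    module = importlib.import_module(module_name)
    return module.add(1, 2), module.subtract(5, 3)


if __name__ == "__main__":
    print(smoke_check())
```

If the extension has been built, this should print the results of the two wrapped C++ functions; otherwise it prints None.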

Up until now, I have largely followed along with the pybind11 tutorial and pybind11’s CMake example repository. In Part 2, I will describe how to add unit testing within this setup.

Part 2: Adding unit tests

Writing Python unit tests

We’ll begin by adding Python unit tests using Python’s built-in unittest module. Under tests/ add an empty file called __init__.py. This file will stay empty, but it is required for unittest’s automatic test discovery.

python_cpp_example/
├── CMakeLists.txt
├── build
├── lib
│   └── pybind11
├── python_cpp_example
│   ├── bindings.cpp
│   ├── math.cpp
│   └── math.hpp
├── setup.py
└── tests
    └── __init__.py

In the same tests/ directory, add a file math_test.py with a few simple unit tests.

import unittest
import python_cpp_example  # our `pybind11`-based extension module

class MainTest(unittest.TestCase):
    def test_add(self):
        # test that 1 + 1 = 2
        self.assertEqual(python_cpp_example.add(1, 1), 2)

    def test_subtract(self):
        # test that 1 - 1 = 0
        self.assertEqual(python_cpp_example.subtract(1, 1), 0)

if __name__ == '__main__':
    unittest.main()

That’s all you need for Python unit tests. You can add as many test files as you want, and each one should define a class that extends unittest.TestCase. As long as all of your files have a _test.py suffix, unittest will automatically discover them. Run python3 setup.py test and you should get output that looks like this:

> python3 setup.py test
test_add (tests.math_test.MainTest) ... ok
test_subtract (tests.math_test.MainTest) ... ok

----------------------------------------------------------------------
Ran 2 tests in 0.001s

OK

Python’s built-in unittest module is a powerful unit testing framework with a variety of built-in assertions. You can read more about unittest in the official Python documentation.
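
To give a flavor of those assertions, here is a small sketch that goes beyond assertEqual. So that it runs without the compiled module, add and subtract below are pure-Python stand-ins that I'm assuming behave like the wrapped functions for these inputs; run_examples is likewise a hypothetical helper.

```python
import unittest


# Hypothetical stand-ins for python_cpp_example.add / subtract
def add(i, j):
    return i + j


def subtract(i, j):
    return i - j


class AssertionExamples(unittest.TestCase):
    def test_equal(self):
        self.assertEqual(add(2, 3), 5)

    def test_less(self):
        # subtracting a larger number gives a negative result
        self.assertLess(subtract(1, 2), 0)

    def test_type_error(self):
        # mixing an int and a str is rejected
        with self.assertRaises(TypeError):
            add(1, "one")


def run_examples():
    """Load and run the test case programmatically, returning the TestResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(AssertionExamples)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

In a real test file you would import python_cpp_example instead of defining the stand-ins, and let the test runner discover the class rather than calling run_examples yourself.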

Writing C++ unit tests with catch

Unlike Python, C++ needs an external library to enable unit testing. I’ve chosen to use catch for its concise syntax and its header-only structure. Download and extract catch in the lib/ directory of your package. On a *nix system, you could run the following:

cd lib
wget https://github.com/philsquared/Catch/archive/v1.9.4.tar.gz
tar -xvf v1.9.4.tar.gz
# Copy catch library into lib/catch (we are already inside lib/)
cp -r catch-1.9.4 catch

Similar to Python’s __init__.py, catch requires an initialization file that we’ll name test_main.cpp. This configuration file must contain two specific lines and nothing else. Under the tests/ directory, add the following file:

test_main.cpp:

#define CATCH_CONFIG_MAIN
#include <catch.hpp>

Now we’ll make a file with two simple unit tests. This time I’m using a test_ prefix (rather than a _test.py suffix) to easily distinguish between Python unit tests and C++ unit tests without looking at the file extension. Add the following to a file named test_math.cpp in the tests/ directory:

test_math.cpp:

#include <catch.hpp>

#include "math.hpp"

TEST_CASE("Addition and subtraction")
{
    REQUIRE(add(1, 1) == 2);
    REQUIRE(subtract(1, 1) == 0);
}

These tests are analogous to the Python tests in the previous section. Normally, I would not unit test both the pybind11 Python wrappers and the underlying C++ definitions for such simple functions. However, you can imagine an instance in which you didn’t want to expose all of your C++ code with Python wrappers, but you still wanted to unit test that C++ code. Likewise, the Python wrappers can get quite complex and it may be useful to test your C++ code independently of the wrapping code.

Your directory structure should now look something like this:

python_cpp_example/
├── CMakeLists.txt
├── LICENSE
├── README.md
├── build
│   └── temp.macosx-10.12-x86_64-3.6
├── lib
│   ├── catch
│   └── pybind11
├── python_cpp_example
│   ├── bindings.cpp
│   ├── math.cpp
│   └── math.hpp
├── python_cpp_example.cpython-36m-darwin.so
├── setup.py
└── tests
    ├── __init__.py
    ├── math_test.py
    ├── test_main.cpp
    └── test_math.cpp

Lastly, we need to instruct CMake that we’ve added C++ unit tests. We’ll add test_main.cpp and test_math.cpp to a TESTS variable. Then we’ll include the catch library and define an executable python_cpp_example_test. Add the following to your CMakeLists.txt file:

SET(TEST_DIR "tests")
SET(TESTS ${SOURCES}
    "${TEST_DIR}/test_main.cpp"
    "${TEST_DIR}/test_math.cpp")

# Generate a test executable
include_directories(lib/catch/include)
add_executable("${PROJECT_NAME}_test" ${TESTS})

Now run python3 ./setup.py develop and if you navigate to build/temp.* (e.g., build/temp.macosx-10.12-x86_64-3.6 on my system), you should see an executable python_cpp_example_test. Running this executable will execute the catch unit tests. Rather than run this executable ourselves, however, we will tell setuptools to run it alongside the Python unit tests.

Writing a custom test-runner for setuptools

Just like we wrote a custom extension builder in setup.py in Part 1, we’ll write a custom test runner to run both Python and C++ unit tests. Add the following import statement and class definition to your setup.py file:

from setuptools.command.test import test as TestCommand

class CatchTestCommand(TestCommand):
    """
    A custom test runner to execute both Python unittest tests and C++ Catch-
    lib tests.
    """
    def distutils_dir_name(self, dname):
        """Returns the name of a distutils build directory"""
        dir_name = "{dirname}.{platform}-{version[0]}.{version[1]}"
        return dir_name.format(dirname=dname,
                               platform=sysconfig.get_platform(),
                               version=sys.version_info)

    def run(self):
        # Run Python tests
        super(CatchTestCommand, self).run()
        print("\nPython tests complete, now running C++ tests...\n")
        # Run catch tests
        subprocess.call(['./*_test'],
                        cwd=os.path.join('build',
                                         self.distutils_dir_name('temp')),
                        shell=True)

We’re defining a class with two methods. The first, distutils_dir_name, is a helper that generates the name of the CMake build directory. The second, run, first runs the Python unit tests and then runs the C++ unit tests by executing python_cpp_example_test. More precisely, it runs any executable with a _test suffix that it finds in the build directory.
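
The directory lookup and the _test suffix match can be sketched on their own, without the shell glob. build_temp_dir and find_test_executables are hypothetical names of my own, and the temp.<platform>-<major>.<minor> pattern is an assumption taken from the class above; other setuptools versions may name this directory differently, so it's worth checking what actually appears under build/.

```python
import glob
import os
import sys
import sysconfig


def build_temp_dir(base="build"):
    """Path of the temporary build directory, following the pattern used above."""
    name = "temp.{}-{}.{}".format(sysconfig.get_platform(),
                                  sys.version_info[0],
                                  sys.version_info[1])
    return os.path.join(base, name)


def find_test_executables(directory):
    """Return executable files in `directory` whose names end in _test."""
    candidates = glob.glob(os.path.join(directory, "*_test"))
    return sorted(path for path in candidates
                  if os.path.isfile(path) and os.access(path, os.X_OK))


if __name__ == "__main__":
    for path in find_test_executables(build_temp_dir()):
        print(path)
```

Listing the executables explicitly like this would also make it possible to run each one without shell=True, at the cost of a few more lines in the test command.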

Next define a new test command at the bottom of setup.py.

setup.py:

setup(
    name='python_cpp_example',
    version='0.1',
    author='Benjamin Jack',
    author_email='benjamin.r.jack@gmail.com',
    description='A hybrid Python/C++ test project',
    long_description='',
    ext_modules=[CMakeExtension('python_cpp_example')],
    # add custom test command
    cmdclass=dict(build_ext=CMakeBuild, test=CatchTestCommand),
    zip_safe=False,
)

And now, the moment we’ve been waiting for: a single command to build and test a hybrid Python/C++ package.

> python3 setup.py test
...

test_add (tests.math_test.MainTest) ... ok
test_subtract (tests.math_test.MainTest) ... ok

----------------------------------------------------------------------
Ran 2 tests in 0.001s

OK

Python tests complete, now running C++ tests...

===============================================================================
All tests passed (2 assertions in 1 test case)

Under the hood, the above command is first building any extension modules, executing Python unit tests, and then executing C++ unit tests.
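
One caveat: subprocess.call returns the exit status of the C++ test executable, and the run method above discards it, so a failing catch test would not by itself make the command exit with an error. If that matters (for example, on a CI server), a minimal sketch of one way to propagate the failure might look like this. run_and_check is a hypothetical helper, not something setuptools provides.

```python
import subprocess
import sys


def run_and_check(command, cwd=None):
    """Run a command and stop with its exit status if it fails."""
    returncode = subprocess.call(command, cwd=cwd)
    if returncode != 0:
        # Propagate the failure so the calling process exits non-zero too
        raise SystemExit(returncode)
    return returncode


if __name__ == "__main__":
    # Stand-in for a test executable: a Python one-liner that succeeds
    run_and_check([sys.executable, "-c", "pass"])
```

Inside CatchTestCommand.run, a helper like this could take the place of the bare subprocess.call.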

Wrap-up

You should now have a functioning, minimal Python/C++ package that is set up to unit test both the Python and C++ code. For a complete working example, see the GitHub repository that accompanies this blog post. In my next blog post, we’ll learn how to auto-generate documentation for the mixed-language code base presented here.