A New Type Conversion Mechanism for Boost.Python

By David Abrahams.


This document describes a redesign of the mechanism for automatically converting objects between C++ and Python. The current implementation uses two functions for any type T:
U from_python(PyObject*, type<T>);
void to_python(V);
where U is convertible to T and T is convertible to V. These functions are at the heart of C++/Python interoperability in Boost.Python, so why would we want to change them? There are many reasons:


Firstly, the current mechanism relies on a common C++ compiler bug. This is not just embarrassing: as compilers become more conformant, the library stops working. The issue, in detail, is the use of inline friend functions in templates to generate conversions. It is a very powerful and legal technique, as long as it's used correctly:

template <class Derived>
struct add_some_functions
{
     friend return-type some_function1(..., Derived cv-*-&-opt, ...);
     friend return-type some_function2(..., Derived cv-*-&-opt, ...);
};

template <class T>
struct some_template : add_some_functions<some_template<T> >
{
};
The add_some_functions template generates free functions which operate on Derived, or on related types. Strictly speaking the related types are not just cv-qualified Derived values, pointers and/or references. Section 3.4.2 in the standard describes exactly which types you must use as parameters to these functions if you want the functions to be found (there is also a less-technical description in section 11.5.1 of C++PL3 [1]). Suffice it to say that with the current design, the from_python and to_python functions are not supposed to be callable under any conditions!
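
A conforming use of the technique can be sketched as follows. This is only an illustration, not code from Boost.Python: add_not_equal, point, and usage_example are hypothetical names. The friend defined in the base template is found by argument-dependent lookup because its parameters are of type Derived, and the base is an associated class of Derived.

```cpp
#include <cassert>

// Hypothetical base template: for every Derived it is instantiated with,
// it generates an ordinary (non-template) free function operator!=.
template <class Derived>
struct add_not_equal
{
    // Defined inline as a friend, so it is visible only through
    // argument-dependent lookup on arguments of type Derived.
    friend bool operator!=(Derived const& a, Derived const& b)
    {
        return !(a == b);
    }
};

// Hypothetical client class: it supplies operator== and inherits operator!=.
struct point : add_not_equal<point>
{
    int x, y;
    point(int x_, int y_) : x(x_), y(y_) {}

    friend bool operator==(point const& a, point const& b)
    {
        return a.x == b.x && a.y == b.y;
    }
};

inline void usage_example()
{
    point p(1, 2), q(3, 4);
    assert(p != q);      // found via ADL in the base add_not_equal<point>
    assert(!(p != p));
}
```

The lookup works here only because a point appears among the arguments. A generated function whose parameter types are not associated with the class that declares the friend would not be found by a conforming compiler.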

Compilation and Linking Time

The conversion functions generated for each wrapped class using the above technique are not function templates, but regular functions. The upshot is that they must all be generated regardless of whether they are actually used. Generating all of those functions can slow down module compilation, and resolving the references can slow down linking.


The conversion functions are primarily used in (member) function wrappers to convert the arguments and return values. Being functions, converters have no interface which allows us to ask "will the conversion succeed?" without calling the function. Since the return value of the function must be the object to be passed as an argument, Boost.Python currently uses C++ exception-handling to detect an unsuccessful conversion. It's not a particularly good use of exception-handling, since the failure is not handled very far from where it occurred. More importantly, it means that C++ exceptions are thrown during overload resolution as we seek an overload that matches the arguments passed. Depending on the implementation, this approach can result in significant slowdowns.
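
The cost described above can be illustrated with a rough sketch of exception-driven overload selection. Everything here is an assumption made for illustration: py_value is a simplified stand-in for PyObject*, and int_from_python, double_from_python, and call_overloaded are hypothetical names rather than the library's own functions.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Simplified stand-in for a Python object holding an int or a float.
struct py_value { bool is_int; long i; double d; };

struct conversion_failed : std::runtime_error
{
    conversion_failed() : std::runtime_error("bad argument type") {}
};

// Converters in the old style: the only way to learn that a conversion
// is impossible is to call the function and catch what it throws.
long int_from_python(py_value const& v)
{
    if (!v.is_int) throw conversion_failed();
    return v.i;
}

double double_from_python(py_value const& v)
{
    if (v.is_int) throw conversion_failed();
    return v.d;
}

int throws_seen = 0;  // counts exceptions raised while searching for a match

// Tries each overload in turn; every mismatch costs one throw and one catch.
std::string call_overloaded(py_value const& arg)
{
    try { int_from_python(arg); return "f(long)"; }
    catch (conversion_failed const&) { ++throws_seen; }

    try { double_from_python(arg); return "f(double)"; }
    catch (conversion_failed const&) { ++throws_seen; }

    return "no match";
}
```

With N overloads, a call that matches the last one pays for N-1 exceptions before any wrapped function runs.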

It is also unclear that the current library generates a minimal amount of code for any type conversion. Many of the conversion functions are nontrivial, and partly because of compiler limitations, they are declared inline. Also, we could have done a better job separating the type-specific conversion code from the code which is type-independent.

Cross-module Support

The current strategy requires every module to contain the definition of conversions it uses. In general, a new module can never supply conversion code which is used by another module. Ralf Grosse-Kunstleve designed a clever system which imports conversions directly from one library into another using some explicit declarations, but it has some disadvantages also:
  1. The system Ullrich Koethe designed for implicit conversion between wrapped classes related through inheritance does not currently work if the classes are defined in separate modules.
  2. The writer of the importing module is required to know the name of the module supplying the imported conversions.
  3. There can be only one way to extract any given C++ type from a Python object in a given module.
The first item might be addressed by moving Boost.Python into a shared library, but the other two cannot. Ralf turned the limitation in item two into a feature: the required module is loaded implicitly when a conversion it defines is invoked. We will probably want to provide that functionality anyway, but it's not clear that we should require the declaration of all such conversions. The final item is a more serious limitation. If, for example, new numeric types are defined in separate modules, and these types can all be converted to doubles, we have to choose just one conversion method.


One persistent source of confusion for users of Boost.Python has been the fact that conversions for a class are not visible at compile-time until the declaration of that class has been seen. When the user tries to expose a (member) function operating on or returning an instance of the class in question, compilation fails...even though the user goes on to expose the class in the same translation unit!

The new system lifts all compile-time checks for the existence of particular type conversions and replaces them with runtime checks, in true Pythonic style. While this might seem cavalier, the compile-time checks are actually not much use in the current system if many classes are wrapped in separate modules, since the checks are based only on the user's declaration that the conversions exist.

The New Design


The new design was heavily influenced by a desire to generate as little code as possible in extension modules. Some of Boost.Python's clients are enormous projects where link time is proportional to the amount of object code, and there are many Python extension modules. As such, we try to keep type-specific conversion code out of modules other than the one the converters are defined in, and rely as much as possible on centralized control through a shared library.

The Basics

The library contains a registry which maps runtime type identifiers (actually an extension of std::type_info which preserves references and constness) to entries containing type converters. An entry can contain only one converter from C++ to Python (wrapper), but many converters from Python to C++ (unwrappers). (Open question: what should happen if multiple modules try to register wrappers for the same type?) Wrappers and unwrappers are known as body objects, and are accessed by the user and the library (in its function-wrapping code) through corresponding handle (wrap<T> and unwrap<T>) objects. The handle objects are extremely lightweight, and delegate all of their operations to the corresponding body.
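
One possible shape for such a registry is sketched below, assuming C++11. All names (py_object, type_key, key_of, entry, entry_for, register_wrapper, register_unwrapper, accepts_kind) are hypothetical, and py_object is a stand-in for PyObject. Rejecting a second wrapper for the same type is just one policy chosen for the sketch; the design leaves that question open.

```cpp
#include <cassert>
#include <map>
#include <tuple>
#include <type_traits>
#include <typeindex>
#include <typeinfo>
#include <vector>

struct py_object { int kind; };  // stand-in for PyObject

// typeid() discards references and top-level cv-qualification, so the key
// records them separately (an assumed way to "extend" std::type_info).
struct type_key
{
    std::type_index base;
    bool is_const;
    bool is_reference;

    bool operator<(type_key const& o) const
    {
        return std::tie(base, is_const, is_reference)
             < std::tie(o.base, o.is_const, o.is_reference);
    }
};

template <class T>
type_key key_of()
{
    typedef typename std::remove_reference<T>::type no_ref;
    type_key k = { std::type_index(typeid(T)),
                   std::is_const<no_ref>::value,
                   std::is_reference<T>::value };
    return k;
}

struct wrapper_base { virtual ~wrapper_base() {} };

struct unwrapper_base
{
    virtual ~unwrapper_base() {}
    virtual bool convertible(py_object const&) const = 0;
};

// Toy unwrapper that accepts objects of one "kind".
struct accepts_kind : unwrapper_base
{
    int kind;
    explicit accepts_kind(int k) : kind(k) {}
    bool convertible(py_object const& o) const override { return o.kind == kind; }
};

// One wrapper at most, any number of unwrappers.
struct entry
{
    wrapper_base* wrapper = nullptr;
    std::vector<unwrapper_base*> unwrappers;
};

inline std::map<type_key, entry>& registry()
{
    static std::map<type_key, entry> r;
    return r;
}

template <class T>
entry& entry_for() { return registry()[key_of<T>()]; }

// Assumed policy: the first wrapper wins and later ones are refused.
template <class T>
bool register_wrapper(wrapper_base* w)
{
    entry& e = entry_for<T>();
    if (e.wrapper) return false;
    e.wrapper = w;
    return true;
}

template <class T>
void register_unwrapper(unwrapper_base* u)
{
    entry_for<T>().unwrappers.push_back(u);
}
```

Because an entry holds a list of unwrappers, several modules could each contribute a way to extract a double from their own numeric type.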

When a handle object is constructed, it accesses the registry to find a corresponding body that can convert the handle's constructor argument. (Actually, the registry record for any type T used in a module is looked up only once and stored in a static registration<T> object for efficiency.) For example, if the handle is an unwrap<Foo&> object, the entry for Foo& is looked up in the registry, and each unwrapper it contains is queried to determine if it can convert the PyObject* with which the unwrap was constructed. If a body object which can perform the conversion is found, a pointer to it is stored in the handle. A body object may at any point store additional data in the handle to speed up the conversion process.

Now that the handle has been constructed, the user can ask it whether the conversion can be performed. All handles can be tested as though they were convertible to bool; a true value indicates success. If the user forges ahead and tries to do the conversion without checking when no conversion is possible, an exception will be thrown as usual. The conversion itself is performed by the body object.
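
The handle protocol described above might look roughly like this, assuming C++11. It is a sketch, not the library's interface: py_object stands in for PyObject, the per-type list in registration<T> replaces the real registry, and unwrapper, float_to_double, and the use of operator* to perform the conversion are all assumptions.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

struct py_object { bool is_float; double value; };  // stand-in for PyObject

// Body interface: can be queried without performing the conversion.
template <class T>
struct unwrapper
{
    virtual ~unwrapper() {}
    virtual bool convertible(py_object const&) const = 0;
    virtual T convert(py_object const&) const = 0;
};

// Per-type record, looked up once and cached in a function-local static.
template <class T>
struct registration
{
    static std::vector<unwrapper<T> const*>& unwrappers()
    {
        static std::vector<unwrapper<T> const*> v;
        return v;
    }
};

// Handle: finds a body on construction and delegates everything to it.
template <class T>
class unwrap
{
public:
    explicit unwrap(py_object const& source)
        : source_(source), body_(nullptr)
    {
        for (unwrapper<T> const* b : registration<T>::unwrappers())
            if (b->convertible(source)) { body_ = b; break; }
    }

    // True when some registered body can perform the conversion.
    explicit operator bool() const { return body_ != nullptr; }

    // Performs the conversion; throws if the caller did not check first.
    T operator*() const
    {
        if (!body_) throw std::runtime_error("no registered conversion");
        return body_->convert(source_);
    }

private:
    py_object source_;
    unwrapper<T> const* body_;
};

// Example body: extracts a double from a "float" object.
struct float_to_double : unwrapper<double>
{
    bool convertible(py_object const& o) const override { return o.is_float; }
    double convert(py_object const& o) const override { return o.value; }
};
```

Overload resolution can then test each candidate's handles with a plain boolean check, and an exception is raised only if the caller converts without checking.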

Handling complex conversions

Some conversions may require a dynamic allocation. For example, when a Python tuple is converted to a std::vector<double> const&, we need some storage into which to construct the vector so that a reference to it can be formed. Furthermore, multiple conversions of the same type may need to be "active" simultaneously, so we can't keep a single copy of the storage anywhere. We could keep the storage in the body object, and have the body clone itself in case the storage is used, but in that case the storage in the body which lives in the registry is never used. If the storage was actually an object of the target type (the safest way in C++), we'd have to find a way to construct one for the body in the registry, since it may not have a default constructor.

The most obvious way out of this quagmire is to allocate the object using a new-expression, and store a pointer to it in the handle. Since the body object knows everything about the data it needs to allocate (if any), it is also given responsibility for destroying that data. When the handle is destroyed it asks the body object to tear down any data it may have stored there. In many ways, you can think of the body as a "dynamically-determined vtable" for the handle.
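
A minimal sketch of this arrangement follows, assuming C++11. The names py_tuple, body_base, tuple_to_vector, unwrap_vector, and the live_storage counter are hypothetical; py_tuple is a stand-in for a Python tuple of floats.

```cpp
#include <cassert>
#include <vector>

struct py_tuple { std::vector<double> items; };  // stand-in for a Python tuple

int live_storage = 0;  // number of per-conversion allocations currently alive

// The body acts as a "dynamically-determined vtable" for the handle.
struct body_base
{
    virtual ~body_base() {}
    virtual void destroy(void* data) const = 0;  // tear down handle data
};

struct tuple_to_vector : body_base
{
    // Allocates storage for one conversion; the handle keeps the pointer.
    void* construct(py_tuple const& t) const
    {
        ++live_storage;
        return new std::vector<double>(t.items);
    }

    // Only the body knows the real type behind the void*.
    void destroy(void* data) const override
    {
        delete static_cast<std::vector<double>*>(data);
        --live_storage;
    }
};

// Handle for std::vector<double> const&: owns nothing it understands,
// and asks the body to clean up when it goes away.
class unwrap_vector
{
public:
    unwrap_vector(tuple_to_vector const& body, py_tuple const& t)
        : body_(&body), data_(body.construct(t)) {}

    ~unwrap_vector() { body_->destroy(data_); }

    unwrap_vector(unwrap_vector const&) = delete;
    unwrap_vector& operator=(unwrap_vector const&) = delete;

    std::vector<double> const& operator*() const
    {
        return *static_cast<std::vector<double>*>(data_);
    }

private:
    body_base const* body_;
    void* data_;
};
```

Because the storage hangs off the handle rather than the body, two conversions of the same type can be active at once without sharing state, and the body in the registry never needs an instance of the target type.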

Eliminating Redundancy

If you look at the current Boost.Python code, you'll see that there are an enormous number of conversion functions generated for each wrapped class. For a given class T, functions are generated to extract the following types from_python:
T const*
T const* const&
T* const&
T const&
std::auto_ptr<T> const&
boost::shared_ptr<T> const&
Most of these are implemented in terms of just a few conversions, and if you're lucky, they will be inlined and cause no extra overhead. In the new system, however, a significant amount of data will be associated with each type that needs to be converted. We certainly don't want to register a separate unwrapper object for all of the above types.

Fortunately, much of the redundancy can be eliminated. For example, if we generate an unwrapper for T&, we don't need an unwrapper for T const& or T. Accordingly, the user's request to wrap/unwrap a given type is translated at compile-time into a request which helps to eliminate redundancy. The rules used to unwrap a type are:

  1. Treat built-in types specially: when unwrapping a value or constant reference to one of these, use a value for the target type. A value will bind to a const reference if necessary and, more importantly, avoids having to dynamically allocate room for an lvalue of a type which can be cheaply copied.
  2. Reduce everything else to a reference to an un-cv-qualified type where possible. Since cv-qualification is lost on Python anyway, there's no point in trying to convert to a const&. What about conversions to values like the tuple->vector example above? It seems to me that we don't want to make a vector<double>& (non-const) converter available for that case. We may need to rethink this slightly.

To handle the problem described above in item 2, we modify the procedure slightly. To unwrap any non-scalar T, we seek an unwrapper for add_reference<T>::type. Unwrappers for T const& always return T&, and are registered under both T & and T const&.
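
The translation rules above can be approximated with standard type traits, assuming C++11 and partial specialization. This is a sketch under stated assumptions: unwrap_key and Foo are hypothetical names, std::is_arithmetic is used as an approximation of "built-in types", and std::add_lvalue_reference plays the role of add_reference.

```cpp
#include <cassert>
#include <type_traits>

struct Foo {};  // hypothetical wrapped class

template <class T>
struct unwrap_key
{
    typedef typename std::remove_reference<T>::type no_ref;
    typedef typename std::remove_cv<no_ref>::type bare;

    // Rule 1: built-in types requested by value or by const reference
    // are unwrapped by value.
    static const bool by_value =
        std::is_arithmetic<bare>::value &&
        (!std::is_reference<T>::value || std::is_const<no_ref>::value);

    // Type under which the unwrapper is looked up in the registry.
    typedef typename std::conditional<
        by_value,
        bare,
        typename std::add_lvalue_reference<T>::type
    >::type type;

    // Type the unwrapper returns: constness is stripped, so an unwrapper
    // found under Foo const& still yields Foo&.
    typedef typename std::conditional<
        by_value,
        bare,
        typename std::add_lvalue_reference<bare>::type
    >::type result;
};
```

Under these assumptions, a request for Foo is looked up as Foo&, while a request for Foo const or Foo const& is looked up as Foo const&, which leaves room for a tuple-to-vector converter to be registered for the const reference only.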

For compilers not supporting partial specialization, unwrappers for T const& must return T const& (since constness can't be stripped), but separate unwrapper objects need to be registered for T & and T const& anyway, for the same reasons. We may want to make it possible to compile as though partial specialization were unavailable even on compilers where it is available, in case modules could be compiled by different compilers with compatible ABIs (e.g. Intel C++ and MSVC6).

Efficient Argument Conversion

Since type conversions are primarily used in function wrappers, an optimization is provided for the case where a group of conversions is used together. Each handle class has a corresponding "_more" class which does the same job, but has a trivial destructor. Instead of asking each "_more" handle to destroy its own body, it is linked into an endogenous list managed by the first (ordinary) handle. The wrap and unwrap destructors are responsible for traversing that list and asking each body class to tear down its handle. This mechanism is also used to determine if all of the argument/return-value conversions can succeed with a single function call in the function wrapping code. We might need to handle return values in a separate step for Python callbacks, since the availability of a conversion won't be known until the result object is retrieved.
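
The linked-handle arrangement can be sketched as follows, assuming C++11. The names handle_base, body_base, counting_body, and all_convertible are hypothetical, and the body lookup is replaced by passing a body pointer directly (null meaning that no conversion was found).

```cpp
#include <cassert>
#include <type_traits>

struct handle_base;

struct body_base
{
    virtual ~body_base() {}
    virtual void destroy(handle_base&) const = 0;  // tear down one handle
};

// State shared by both kinds of handle, including the intrusive link.
struct handle_base
{
    body_base const* body;  // null when no conversion was found
    handle_base* next;      // next handle in the same argument group
    void* data;             // per-conversion data owned by the body
};

// Follow-on handle: links itself after the previous handle in the group
// and has no destructor of its own.
struct unwrap_more : handle_base
{
    unwrap_more(body_base const* b, handle_base& prev)
    {
        assert(prev.next == nullptr);  // handles are chained in order
        body = b; next = nullptr; data = nullptr;
        prev.next = this;
    }
};

// First handle of a group: owns the traversal of the whole list.
struct unwrap : handle_base
{
    explicit unwrap(body_base const* b)
    {
        body = b; next = nullptr; data = nullptr;
    }

    ~unwrap()
    {
        for (handle_base* h = this; h; h = h->next)
            if (h->body) h->body->destroy(*h);
    }

    // One call answers "can every conversion in the group succeed?"
    bool all_convertible() const
    {
        for (handle_base const* h = this; h; h = h->next)
            if (!h->body) return false;
        return true;
    }
};

// Toy body that just counts teardown requests.
struct counting_body : body_base
{
    mutable int destroyed;
    counting_body() : destroyed(0) {}
    void destroy(handle_base&) const override { ++destroyed; }
};
```

In a function wrapper the handles would be declared in argument order as locals, so the compiler emits a single non-trivial destructor call per wrapper instead of one per argument.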


