Writing extension modules for pypy¶
This document tries to explain how to interface the PyPy python interpreter with any external library.
Right now, there are the following possibilities of providing third-party modules for the PyPy python interpreter (in order of usefulness):
- Write them in pure Python and use CFFI.
- Write them in pure Python and use ctypes.
- Write them in C++ and bind them through Reflex.
- Write them in as RPython mixed modules.
CFFI is the recommended way. It is a way to write pure Python code that accesses C libraries. The idea is to support either ABI- or API-level access to C — so that you can sanely access C libraries without depending on details like the exact field order in the C structures or the numerical value of all the constants. It works on both CPython (as a separate pip install cffi) and on PyPy, where it is included by default.
PyPy’s JIT does a quite reasonable job on the Python code that call C functions or manipulate C pointers with CFFI. (As of PyPy 2.2.1, it could still be improved, but is already good.)
See the documentation here.
The goal of the ctypes module of PyPy is to be as compatible as possible with the CPython ctypes version. It works for large examples, such as pyglet. PyPy’s implementation is not strictly 100% compatible with CPython, but close enough for most cases.
We also used to provide ctypes-configure for some API-level access. This is now viewed as a precursor of CFFI, which you should use instead. More (but older) information is available here. Also, ctypes’ performance is not as good as CFFI’s.
PyPy implements ctypes as pure Python code around two built-in modules called _ffi and _rawffi, which give a very low-level binding to the C library libffi. Nowadays it is not recommended to use directly these two modules.
The builtin cppyy module uses reflection information, provided by Reflex (which needs to be installed separately), of C/C++ code to automatically generate bindings at runtime. In Python, classes and functions are always runtime structures, so when they are generated matters not for performance. However, if the backend itself is capable of dynamic behavior, it is a much better functional match, allowing tighter integration and more natural language mappings.
The cppyy module is written in RPython, thus PyPy’s JIT is able to remove most cross-language call overhead.
RPython Mixed Modules¶
This is the internal way to write built-in extension modules in PyPy. It cannot be used by any 3rd-party module: the extension modules are built-in, not independently loadable DLLs.
This is reserved for special cases: it gives direct access to e.g. the details of the JIT, allowing us to tweak its interaction with user code. This is how the numpy module is being developed.