Let's Build Chuck Norris! - Part 5: Python and cffi
Dimitri Merejkowsky
Posted on April 7, 2018
Originally published on my blog.
Note: This is part 5 of the Let’s Build Chuck Norris! series.
Last week we wrote Python bindings for the chucknorris library using ctypes.
We managed to get some Chuck Norris facts from a Python program.
On the plus side, we did not have to compile anything. Everything was done directly in Python.
There were a few issues, though:
- We had to pass the path to the libchucknorris.so shared library to ctypes.cdll.LoadLibrary.
- We had to duplicate information about the parameter types of the C functions inside our Python program, and mistakes were easy to make.
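As a reminder of what that duplication looks like, here is a minimal sketch. It wraps strlen() from the C standard library rather than chucknorris, so that it runs without building anything, and it assumes a POSIX system. The c_strlen name is made up for this example. The argtypes and restype lines are the part we have to get right by hand, because ctypes cannot check them against a header.

```python
import ctypes
import ctypes.util

# Assumption: a POSIX system. find_library("c") may return None there,
# in which case CDLL(None) exposes the symbols already loaded in the process.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Hand-written copy of the C prototype: size_t strlen(const char *s);
# Nothing checks these two lines against the real declaration.
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(data: bytes) -> int:
    """Call the C strlen() on a bytes object."""
    return libc.strlen(data)


if __name__ == "__main__":
    print(c_strlen(b"Chuck Norris"))
```

If the real prototype ever changes, this code keeps loading happily and misbehaves at call time.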
Another way to wrap C code in Python is to use a C extension: that is, a Python module written in C. In this case the Python module actually takes the form of a shared library, and is thus loaded by the python interpreter at runtime.
There are many ways to create a Python C extension, from directly writing the C code (using the Python C API), to generating the C code and then compiling it.
In the past, I’ve used tools like boost::python and swig for this task.
I only started using cffi recently, but I find it easier to use, and, contrary to the above tools, it is compatible with pypy, which is kind of awesome.
First try with cffi
If you browse the documentation you will see that cffi can be used in several modes. There is an ABI mode and an API mode. The ABI mode resembles the technique we used with ctypes, because it involves loading chucknorris as a shared library.
We are going to use the API mode instead, where all the code is generated using the chucknorris.h header, which minimizes the chance of mistakes.
This means we can go back to building chucknorris as a static library. That way we won’t have to care about the location of the library, and the chucknorris code will be used at compile time. (Sorry for the little detour.)
All we have to do is re-run cmake and ninja:
$ cd cpp/ChuckNorris/build/default
$ cmake -DBUILD_SHARED_LIBS=OFF ../..
$ ninja
[1/7] Building C object CMakeFiles/c_demo.dir/src/main.c.o
[2/7] Building CXX object CMakeFiles/chucknorris.dir/src/c_wrapper.cpp.o
[3/7] Building CXX object CMakeFiles/cpp_demo.dir/src/main.cpp.o
[4/7] Building CXX object CMakeFiles/chucknorris.dir/src/ChuckNorris.cpp.o
[5/7] Linking CXX static library lib/libchucknorris.a
[6/7] Linking CXX executable bin/c_demo
[7/7] Linking CXX executable bin/cpp_demo
Now, let’s write a Python build script for our C extension using the cffi builder:
build_chucknorris.py:
from cffi import FFI
ffibuilder = FFI()
ffibuilder.set_source(
"_chucknorris",
"""
#include <chucknorris.h>
""",
)
ffibuilder.cdef("""
typedef struct chuck_norris chuck_norris_t;
chuck_norris_t* chuck_norris_init(void);
const char* chuck_norris_get_fact(chuck_norris_t*);
void chuck_norris_deinit(chuck_norris_t*);
""")
- We instantiate an FFI object we call ffibuilder.
- In ffibuilder.set_source() we give the builder the name of the C extension: _chucknorris. It’s common for C extension names to be prefixed with an underscore.
- We also give the FFI builder the C code it needs to compile the code it generates. (Here we only need to include the <chucknorris.h> header, but in a real project you may add things like macros or additional helper code.)
- Finally we list the functions and types we want exposed in our C extension as C declarations – directly copy/pasted from the chucknorris.h header – and pass them as a string to ffibuilder.cdef().
Keeping things DRY
“But wait a minute!”, I hear you say. “You said cffi was better than ctypes because we did not have to duplicate information about types, but now you are telling us we still need to copy/paste C declarations inside the call to .cdef()! What gives?”.
Well, it’s true we usually try to keep things DRY when we write code, (DRY meaning “don’t repeat yourself”).
However, when using cffi it does not matter that much. Not following DRY is only dangerous when the duplicated code does not change at the same time and gets out of sync.
Let’s say you break the API of your library (for instance by changing the number of arguments of a C function). If you don’t reflect the change in ffibuilder.cdef(), you will get a nice compilation error, instead of a crash or segfault like we experienced with ctypes.
Adding a setup.py
With that out of the way, let’s add a setup.py file we can use to develop our bindings, install our code, and re-distribute it to others:
setup.py:
from setuptools import setup

setup(
    name="chucknorris",
    version="0.1",
    description="chucknorris python bindings",
    author="Dimitri Merejkowsky",
    py_modules=["chucknorris"],
    setup_requires=["cffi"],
    cffi_modules=["build_chucknorris.py:ffibuilder"],
    install_requires=["cffi"],
)
Here’s what the parameters do:
- py_modules: the list of Python modules to install. We only have one, called chucknorris, that will use the _chucknorris C extension and expose a more “Pythonic” API.
- setup_requires: what the setup.py script needs in order to build the extension.
- cffi_modules: the list of Python objects to be called when building the extension. Here it’s the ffibuilder object defined in the build_chucknorris.py file.
- install_requires: the list of the dependencies of our module once it has been built and installed. We also need cffi at runtime, not just for compiling the extension.
Finally, we can write the implementation of the chucknorris Python module:
chucknorris.py:
from _chucknorris import lib, ffi


class ChuckNorris:
    def __init__(self):
        # Opaque handle returned by the C library
        self._ck = lib.chuck_norris_init()

    def get_fact(self):
        c_fact = lib.chuck_norris_get_fact(self._ck)
        fact_as_bytes = ffi.string(c_fact)
        return fact_as_bytes.decode("UTF-8")

    def __del__(self):
        lib.chuck_norris_deinit(self._ck)


def main():
    chuck_norris = ChuckNorris()
    print(chuck_norris.get_fact())


if __name__ == "__main__":
    main()
- We start by importing code from the _chucknorris C extension. lib contains what has been wrapped – declared with ffibuilder.cdef() –, and ffi contains various cffi helpers.
- We hide the lib.chuck_norris_init() and lib.chuck_norris_deinit() calls under the __init__ and __del__ methods. (Exactly what we did when going from the C++ constructor and destructor to the C functions, but the other way around.)
- In the get_fact() method, we call lib.chuck_norris_get_fact(). chuck_norris_get_fact() returns a “C string”, which is just a char* array that ends with \0. We pass it to ffi.string() to get a bytes object, suitable for holding this kind of data. And finally, we convert the bytes to a real string using decode().
- Finally, when the chucknorris.py script is called, we use our nice Python class as if no C code ever existed :)
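One caveat with __del__: Python does not promise when it runs, and on pypy, which does not use reference counting, it can run much later than on CPython. A possible variant, sketched below, adds an explicit close() method and context-manager support. FakeLib is a hypothetical stand-in for _chucknorris.lib so that the sketch runs without the extension, and passing lib to the constructor is only a convenience for this example.

```python
class FakeLib:
    """Hypothetical stand-in for _chucknorris.lib, only here to make the sketch runnable."""

    def __init__(self):
        self.live_handles = 0

    def chuck_norris_init(self):
        self.live_handles += 1
        return object()

    def chuck_norris_deinit(self, handle):
        self.live_handles -= 1


class ChuckNorris:
    def __init__(self, lib):
        self._lib = lib
        self._ck = lib.chuck_norris_init()

    def close(self):
        # Idempotent: calling close() twice releases the handle only once.
        if getattr(self, "_ck", None) is not None:
            self._lib.chuck_norris_deinit(self._ck)
            self._ck = None

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()

    def __del__(self):
        # Safety net for callers who forget to close().
        self.close()


if __name__ == "__main__":
    fake = FakeLib()
    with ChuckNorris(fake):
        print("inside the with block:", fake.live_handles)
    print("after the with block:", fake.live_handles)
```

With the real extension, the with block would guarantee that chuck_norris_deinit() is called as soon as we are done, whatever the interpreter.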
Running the builder
After installing the cffi package, we can finally try and build the code:
$ python setup.py build_ext
running build_ext
generating ./_chucknorris.c
...
building '_chucknorris' extension
gcc ... -fPIC ... -I/usr/include/python3.6m -c _chucknorris.c -o ./_chucknorris.o
_chucknorris.c:493:14: fatal error: chucknorris.h: No such file or directory
What happened?
- python setup.py build_ext found out how to use our ffibuilder object.
- It generated some C code in a _chucknorris.c file.
- It started building the _chucknorris extension using _chucknorris.c and the code we passed in ffibuilder.set_source() and ffibuilder.cdef().
- The ffibuilder.compile() method knew about our old friend -fPIC, and about the path to the Python includes (-I/usr/include/python3.6m), but it could not find the chucknorris.h header and the compilation failed.
Tweaking the ffibuilder
Clearly the ffibuilder needs to know about the chucknorris library and the chucknorris include path.
We can pass them directly to the set_source() method using the extra_objects and include_dirs parameters [1].
build_chucknorris.py:
import path
cpp_path = path.Path("../cpp/ChuckNorris").abspath()
cpp_build_path = cpp_path.joinpath("build/default")
ck_lib_path = cpp_build_path.joinpath("lib/libchucknorris.a")
ck_include_path = cpp_path.joinpath("include")
ffibuilder.set_source(
"_chucknorris",
"""
#include <chucknorris.h>
""",
extra_objects=[ck_lib_path],
include_dirs=[ck_include_path],
)
...
Note that we use the wonderful path.py library to handle path manipulations, which we can add to our setup.py file:
from setuptools import setup
setup(name="chucknorris",
version="0.1",
...
setup_requires=["cffi", "path.py"],
...
)
The missing symbols
Let’s try to build our extension again:
$ python setup.py build_ext
running build_ext
generating ./_chucknorris.c
...
building '_chucknorris' extension
gcc ... -fPIC ... -I/usr/include/python3.6m -c _chucknorris.c -o ./_chucknorris.o
gcc ... -shared ... -o build/lib.linux-x86_64-3.6/_chucknorris.abi3.so
OK, this works.
Now let’s run python setup.py develop so that we can import the C extension directly:
$ python setup.py develop
...
generating cffi module 'build/temp.linux-x86_64-3.6/_chucknorris.c'
already up-to-date
...
copying build/lib.linux-x86_64-3.6/_chucknorris.abi3.so -> ...
Note that setup.py develop takes care of building the extension for us, and can even skip compilation entirely when nothing needs to be rebuilt.
Now let’s run the chucknorris.py file:
$ python chucknorris.py
Traceback (most recent call last):
  File "chucknorris.py", line 1, in <module>
    from _chucknorris import lib, ffi
ImportError: .../_chucknorris.abi3.so: undefined symbol: _ZNSt8ios_base4InitD1Ev
Damned!
That’s the problem with shared libraries. gcc happily lets you build a shared library even if there are symbols that are not defined anywhere. It just assumes the missing symbols will be provided by the time the library is loaded.
Thus, the only way to make sure a shared library has been properly built is to actually load it from an executable. [2]
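For a Python extension, "loading it from an executable" can be as simple as importing it, since an undefined symbol in the .so shows up as an ImportError. Here is a minimal sketch of such a smoke test. The check_loadable name is made up for this example.

```python
import importlib


def check_loadable(module_name):
    """Try to import a module; return (True, "") or (False, error message)."""
    try:
        importlib.import_module(module_name)
    except ImportError as error:
        # Covers missing modules as well as shared libraries that fail to load.
        return False, str(error)
    return True, ""


if __name__ == "__main__":
    ok, message = check_loadable("_chucknorris")
    print("ok" if ok else "load failed: " + message)
```

Running something like this right after the build catches the problem before anyone tries to use the bindings.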
Again we are faced with the task of guessing the library from the symbol name. Since it looks like a mangled C++ symbol, we can use c++filt to get a more human-readable name:
$ c++filt _ZNSt8ios_base4InitD1Ev
std::ios_base::Init::~Init
Here I happen to know this is a symbol that comes from the c++ runtime library, the library that contains things like the implementation of std::string.
We can solve the problem by passing the name of the c++ library directly as a libraries parameter:
ffibuilder.set_source(
"_chucknorris",
"""
#include <chucknorris.h>
""",
extra_objects=[ck_lib_path],
include_dirs=[ck_include_path],
libraries=["stdc++"],
)
Note: we could also have set the language parameter to c++, and invoked the C++ linker when linking _chucknorris.so, because the C++ linker knows where the c++ runtime library is. [3]
Let’s try again:
$ python setup.py develop
$ python chucknorris.py
ImportError: .../_chucknorris.abi3.so: undefined symbol: sqlite3_close
This one is easier: chucknorris depends on libsqlite3, so we have to link with sqlite3 too.
In the CMakeLists.txt we wrote back in part 2, when we were building the cpp_demo executable, we just called target_link_libraries(cpp_demo chucknorris). CMake knew about the dependency from the chucknorris target to the sqlite3 library and everything worked fine.
But we’re not using the CMake / conan build system here, we are using the Python build system. How can we make them cooperate?
The json generator
Since conan 1.2.0 there is a generator called json [4] we can use to get machine-readable information about dependencies.
Here’s how we can use this json file inside our ffibuilder.
First, let’s add json to the list of conan generators:
conanfile.txt:
[requires]
sqlite3/3.21.0@dmerej/test
...

[generators]
cmake
json
Then, let’s re-run conan install:
$ cd cpp/python/build/default
$ conan install ../..
...
PROJECT: Installing /home/dmerej/src/chucknorris/cpp/ChuckNorris/conanfile.txt
...
sqlite3/3.21.0@dmerej/test: Already installed!
PROJECT: Generator cmake created conanbuildinfo.cmake
PROJECT: Generator json created conanbuildinfo.json
...
This generates a conanbuildinfo.json file looking like this:
build/default/conanbuildinfo.json:
{
  "dependencies": [
    {
      "version": "3.21.0",
      "name": "sqlite3",
      "libs": [
        "sqlite3",
        "pthread",
        "dl"
      ],
      "include_paths": [
        "/.../.conan/data/sqlite3/.../<id>/include"
      ],
      "lib_paths": [
        "/.../.conan/data/sqlite3/.../<id>/lib"
      ]
    }
  ]
}
Now we can parse the json file and pass the libraries and include paths to the ffibuilder.set_source() function:
build_chucknorris.py:
import json

import path

# ffibuilder is the FFI() object created at the top of this script

cpp_path = path.Path("../cpp/ChuckNorris").abspath()
cpp_build_path = cpp_path.joinpath("build/default")

# Static libraries passed directly to the linker
extra_objects = []
libchucknorris_path = cpp_build_path.joinpath("lib/libchucknorris.a")
extra_objects.append(libchucknorris_path)

include_dirs = []
include_dirs.append(cpp_path.joinpath("include"))

libraries = ["stdc++"]

conan_info = json.loads(cpp_build_path.joinpath("conanbuildinfo.json").text())
for dep in conan_info["dependencies"]:
    for lib_name in dep["libs"]:
        lib_filename = "lib%s.a" % lib_name
        for lib_path in dep["lib_paths"]:
            candidate = path.Path(lib_path).joinpath(lib_filename)
            if candidate.exists():
                # Use the static archive provided by conan
                extra_objects.append(candidate)
            else:
                # Otherwise let the linker find the library by name
                libraries.append(lib_name)
    for include_path in dep["include_paths"]:
        include_dirs.append(include_path)

ffibuilder.set_source(
    "_chucknorris",
    """
    #include <chucknorris.h>
    """,
    extra_objects=extra_objects,
    include_dirs=include_dirs,
    libraries=libraries,
)
And now everything works as expected:
$ python3 setup.py clean develop
$ python chucknorris.py
There are no weapons of mass destruction in Iraq, Chuck Norris lives in Oklahoma.
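If you want to unit-test the part of build_chucknorris.py that reads the conan data, one option is to move it into a function. The sketch below is a variant, not the code from the script: collect_link_settings is a name made up for this example, it uses pathlib from the standard library instead of path.py, and it assumes the parsed json has the dependencies, libs, lib_paths and include_paths keys shown earlier. It also stops at the first lib_path that contains the static archive, and falls back to linking by name only when no lib_path has it.

```python
from pathlib import Path


def collect_link_settings(conan_info):
    """Return (extra_objects, libraries, include_dirs) from parsed conanbuildinfo.json."""
    extra_objects, libraries, include_dirs = [], [], []
    for dep in conan_info["dependencies"]:
        for lib_name in dep["libs"]:
            for lib_path in dep["lib_paths"]:
                candidate = Path(lib_path) / ("lib%s.a" % lib_name)
                if candidate.exists():
                    # Static archive found: link it in directly.
                    extra_objects.append(str(candidate))
                    break
            else:
                # No archive in any lib_path: let the linker find it by name.
                libraries.append(lib_name)
        include_dirs.extend(dep["include_paths"])
    return extra_objects, libraries, include_dirs


if __name__ == "__main__":
    # Made-up data: no archive exists here, so both names end up in libraries.
    info = {
        "dependencies": [
            {"libs": ["pthread", "dl"], "lib_paths": [], "include_paths": []}
        ]
    }
    print(collect_link_settings(info))
```

The three lists can then be passed to ffibuilder.set_source() as before, with libchucknorris.a, the chucknorris include directory and stdc++ added by hand.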
We can even build a pre-compiled wheel that other people can use without needing to compile the chucknorris project themselves:
On the developer machine:
$ python setup.py bdist_wheel
...
running build_ext
...
building '_chucknorris' extension
...
creating 'dist/chucknorris-0.1-cp36-cp36m-linux_x86_64.whl' and adding '.' to it
...
On another machine:
$ pip install chucknorris-0.1-cp36-cp36m-linux_x86_64.whl
$ python -c 'import chucknorris; chucknorris.main()'
For this to work, the other user will need to be on Linux, have a compatible C++ library and the same version of Python, but as far as distribution of binaries on Linux usually goes, isn’t this nice?
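Some of those constraints are spelled out in the name of the wheel itself. As a rough sketch, here is how to pull the name apart. parse_wheel_filename is a helper made up for this post, and it ignores the optional build tag that the wheel format allows.

```python
from collections import namedtuple

WheelTags = namedtuple("WheelTags", "name version python abi platform")


def parse_wheel_filename(filename):
    """Split a wheel file name into its five dash-separated fields."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel: %s" % filename)
    parts = filename[: -len(".whl")].split("-")
    if len(parts) != 5:
        # Wheels with a build tag have six fields; not handled in this sketch.
        raise ValueError("unexpected number of fields in %s" % filename)
    return WheelTags(*parts)


if __name__ == "__main__":
    print(parse_wheel_filename("chucknorris-0.1-cp36-cp36m-linux_x86_64.whl"))
```

Here cp36 and cp36m stand for CPython 3.6, and linux_x86_64 for 64-bit Linux. pip compares these tags with the ones supported by the interpreter it runs under, and refuses to install the wheel when they don't match.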
There’s an entire blog post to be written about distribution of pre-compiled Python binary modules, but enough about Python for now :)
See you next time, when we’ll use everything we learned here and start porting Chuck Norris to Android.
Thanks for reading this far :)
I'd love to hear what you have to say, so please feel free to leave a comment below, or read the feedback page for more ways to get in touch with me.
[1] ffibuilder.set_source() uses the same API as the distutils Extension class.
[2] This also means you should really have at least one executable to test every shared library you write, but you already knew that, right?
[3] Not sure what the best move is here. If you have an opinion on it, please let me know. PS: I know we could also use static linking, but I’m saving that for the part where we build ChuckNorris on Android.
[4] Disclaimer: the json generator feature was added by yours truly.