some python ctypes stuff in Rtree

I've been working with and on the Rtree python module. It's some cool work done by Howard Butler (hobu) (originally with Sean Gillies) to make python bindings for the Spatial Index C++ library by Marios Hadjieleftheriou which provides various tree structures for efficient spatial searches in n-dimensions. Hobu has written a C API for that along with a new ctypes wrapper to that API which appears in Rtree 0.5 and greater. There is some cool ctypes stuff in there which I'm starting to understand.
From the website:

ctypes is a foreign function library for Python. It provides C compatible data types, and allows calling functions in DLLs or shared libraries. It can be used to wrap these libraries in pure Python.

as a simple example of how ctypes works we can pretend there's no atan() in python's math module and access the one from libm in c like this:
import ctypes
libm = ctypes.CDLL('libm.so.6')

# the following 2 lines correspond to the c signature: double atan(double)
libm.atan.argtypes = [ctypes.c_double]
libm.atan.restype = ctypes.c_double

print libm.atan(0.22)
print libm.atan(ctypes.c_double(0.22))

where line 2 tells ctypes how to find the library (have a look at Rtree or shapely source code to see the cross-platform way to do that). lines 4, 5 tell it the input types (argtypes) and return type (restype) respectively, and lines 7, 8 call the c function by way of the ctypes wrapper. Here, it's calling the version of atan with a double precision number. With simple types, you can let ctypes wrap a python value in the type or you can do so explicitly as in the last line.

Things get more interesting with more complicated return types. For c function with a char * return type, e.g. this contrived example:
// does not need to be freed.
char* fn_char(){
char *s = "asdf";
return s;
}

the ctypes invocation -- with 'ccode' being this contrived library as loaded by ctypes.CDLL -- looks like:
ccode.fn_char.restype = ctypes.c_char_p
print ccode.fn_char()

which returns "asdf" as expected and does not leak. (Note that ctypes.c_char_p is "c char pointer" or "char *".) If you get a copy of a char * and are responsible for freeing it's memory, e.g. from the c function:
// needs to be freed.
const char * fn_const_char(){
return (const char * )strdup("asdf");
}

the ctypes for that looks like:

def get_and_free(achar_p):
s = ctypes.string_at(achar_p)
libc.free(achar_p)
return s


ccode.fn_const_char.restype = get_and_free
print ccode.fn_const_char()

where libc is the standard c library defining the function to free() memory. In this case, it takes advantage of the feature that .restype can be a callable which takes the pointer return from the c code. In get_and_free(), ctypes.string_at() turns that pointer address into a python string. Then the char * pointer is free'd, and the python string is returned, and "asdf" is printed as expected.
It's also possible to do more rigorous error checking with errcheck in which case the ctypes looks like:

def err_check(char_p, fn, args, fn, args):
s = ctypes.string_at(char_p)
libc.free(char_p)
return s

ccode.fn_const_char.restype = ctypes.POINTER(ctypes.c_char)
ccode.fn_const_char.errcheck = err_check

where err_check gets 3 arguments, the result of the function, a reference to the function, and the args sent to it (in this case args is empty). Note that in this case, we have to specify the restype ctypes.POINTER(ctypes.c_char) so that we still have the pointer address--which we then free. When the restype is specified as ctypes.c_char_p (char *), then ctypes automatically gives us the python string and we can't (as far as I know) free the memory and a leak occurs. Also, in the case above, I haven't actually added any extra error checking, in Rtree hobu has a few functions to check the return values, see that code here.

This post has been pretty basic, there's also good ctypes code in shapely, geodjango, and libLAS. Next post I'll talk about callback functions -- calling a C function that expects a pointer-to-a-function with a python function as an argument.

Comments

Popular posts from this blog

filtering paired end reads (high throughput sequencing)

python interval tree

needleman-wunsch global sequence alignment -- updates and optimizations to nwalign