Showing posts from April 4, 2010


Disclaimer: This is about a generic indexing method that fits in < 50 lines of code, while it's cool, I'm not suggesting it be used in place of a real project. The code is here and explained below.

Everyone's indexing big sequence files. Peter Cock has a bio-python branch to index via sqlite. Brad Chapman writes up a nice approach using tools from bx-python. And there's my own pyfasta for fasta files. This morning I set out to use another one: screed from the pygr guys to index fastq and fasta files via sqlite db. Titus wrote about screed and posted to the biology-in-python mailng list, which is how i originally heard about it.

Screed and the biopython branch use sqlite to get quickly from name to thing--random access. This is a nice approach because sqlite comes with python and it's easy to use and quite fast. Thinking simply about an index, all it really does is get you from some id (e.g. a fasta header or fastq name) to the thing (the fasta sequence or the fas…

writing and building a lua c extension module

[Update: 2009-04-09]
This package is now on luarocks under it's new name: "stringy". It now includes startswith and endswith methods (more coming). Under advice of lua gurus, I no longer add to the string base methods. And finally, it's available in my lua github repo.
I've been messing with Lua programming language lately, mostly because I wanted to try to use love2d as a visualization tool. I got side-tracked into building a c extension for lua. The C-API is much different from python.
In lua, all the c functions have a signature like:int c_function(lua_State *L)and you use the lua c API to pull values off the lua_state stack thingy -- L.
And then to return values, you just push them back onto the stack. I don't grok this fully yet, but it seems to handle all the memory allocation for you.
Anyway, it's hard to find a full example of creating a C extension for lua 5.1. It actually seems more common just to provide patches for the lua distribution its…