Showing posts from January 4, 2009

Stupid things I did as a Bioinformatics Programmer in 2008

In 2008, I was good enough at programming to get my ass kicked by hard problems. I think that's the most positive way to say it.
My main bioinformatics project was a long annotation pipeline. It takes days to run, often using 8 CPUs. It's driven by a big ol' Makefile. I made the mistake of passing data between steps in un-structured text files or python pickles.

I'd create one at the beginning of the pipeline and not notice it was messed up in a way that affected other parts until the entire pipeline was done, days later. Toward the end of the pipeline, I'd need something simple, like the strand of a BLAST hit, but I'd have to parse through an entire GFF file, or load some huge pickle into memory just to get to that. Then I'd need some annotation, and I'd have to add a slow step of doing a lookup in a script that'd otherwise run very quickly.

I was passing around data in arrays and tuples, so then when I changed the order or added another element in…