Bioinformatics

Needleman-Wunsch global sequence alignment -- updates and optimizations to nwalign

January 10, 2010

I've written previously about nwalign, a Python package I wrote to do fast (thanks to Cython) global sequence alignment. Over break, I spent some time fixing some bugs and improving performance.

Ack

It's actually nice to get bug reports for a relatively obscure bit of software like this, as it shows it's getting used. Thanks especially to R. Christen for his patience in (repeatedly) showing me places where nwalign was not doing the right thing.

Bugs

Placement of Gaps

Some of the "bugs" were actually just "unspecified behavior". For example, given input text of "AGEBAMAM" and "AGEBAM", the alignments:

    AGEBAMAM
    AGEBAM--

and

    AGEBAMAM
    AGEB--AM

have the same score. However, the first version is generally more, um, polite. Previously, the behavior was deterministic, but depended on the length and order of the input sequences. As of version 0.3, nwalign will always put matches together when given a choice between 2 (or more) paths through t...
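
To make the tie between the two "AGEBAMAM" / "AGEBAM" alignments concrete, here is a minimal sketch that scores two already-aligned rows column by column. It does not use nwalign itself. The function name `score_alignment` and the scoring values (match = 1, mismatch = -1, gap = -1) are assumptions chosen for illustration, not nwalign's actual scoring matrix or gap penalties.

```python
def score_alignment(row_a, row_b, match=1, mismatch=-1, gap=-1):
    """Score two already-aligned, equal-length rows column by column.

    Hypothetical helper with illustrative scores; not part of nwalign.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    total = 0
    for a, b in zip(row_a, row_b):
        if a == "-" or b == "-":
            total += gap        # a gap in either row
        elif a == b:
            total += match      # identical residues
        else:
            total += mismatch   # different residues
    return total


# The two candidate alignments from the post.
trailing = score_alignment("AGEBAMAM", "AGEBAM--")
internal = score_alignment("AGEBAMAM", "AGEB--AM")
print(trailing, internal)  # both have 6 matches and 2 gap columns
```

Under this simple linear gap scheme both alignments contain six matching columns and two gap columns, so the score alone cannot pick between them, and a tie-breaking rule has to decide where the gap goes.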