Bioinformatics

Posts

Showing posts from March 9, 2008

rtree: know your nearest neigbhors

March 11, 2008

My computer spends a lot of time looking for neighbors of a given location-- even more so for bio, than for geo. This is what I've learned about the options for doing smarter search so far. SELECT * from ( (SELECT * FROM ((SELECT * FROM feature WHERE start = ? ORDER BY start LIMIT 1)) as u) UNION (SELECT * FROM ((SELECT * FROM feature where stop = ? ORDER BY stop LIMIT 1)) as v) ) as w ORDER BY ABS((start + stop)/2 - ?) LIMIT 1 if you fill in ? with an integer location, that query will return the closest feature most of the time. It's verbose if not ugly, and that's only for 1 dimension. It can return the wrong feature in certain cases.... You have to write it like that in MySQL, because it doesnt support functional indexes , so as soon as you do something like: ORDER BY ABS((start + stop)/2 - ?) it's no longer an indexed search It's a hard problem, even if you're using postgis . And even if you're a postGIS badass . Other than postGIS, there postgres...