Posts

Showing posts from February 15, 2009

biohash

This is a quick project I did a while back but, I've seen people interested in similar ideas, so I'll post my implementation here. Geohash encodes latitude/longtitude locations into a string such that "nearby places will often present similar prefixes" and the longer the string, the greater the precision. Using this python implementation by Schuyler as a reference, I ported the concept to a "biohash" which can encode intervals. It works in a similar fashion, starting with the extremes and halving the space until it finds the smallest space that contains the interval. The use to allow efficient search of intervals using a BTree index, as in any relational db. It's implemented with only a dumps() and loads() function after the pickle interface. The dumps function takes start and end args and returns a 1/0 encoded string. The loads takes a 1/0 encoded string and returns the tightest interval it can given that string. Both functions take a rng kwarg, whic