Friday, April 11, 2008

python script as wsgi, cgi, or standalone

EDIT:
See below for the original; I realized this could be done cleanly with a decorator.
The decorator wrapplication takes the port number to use when the script is run as a standalone server. The EMiddle class isn't strictly necessary; it's just middleware that updates the environ to show the request came via wsgi. If there's a cleaner way, let me know.

#!/usr/bin/python
import os

class EMiddle(object):
    def __init__(self, app):
        self.app = app
    def __call__(self, env, start_response):
        env['hello'] = 'wsgi'
        return self.app(env, start_response)

def wrapplication(port):
    def wrapper(wsgi_app):
        if 'TERM' in os.environ:
            print "serving on port: %i" % port
            os.environ['hello'] = 'standalone'
            from wsgiref.simple_server import make_server
            make_server('', port, wsgi_app).serve_forever()

        elif 'CGI' in os.environ.get('GATEWAY_INTERFACE',''):
            os.environ['hello'] = 'cgi'
            import wsgiref.handlers
            wsgiref.handlers.CGIHandler().run(wsgi_app)
        else:
            return EMiddle(wsgi_app)
    return wrapper

@wrapplication(3000)
def application(environ, start_response):
    start_response("200 OK", [('Content-Type', 'text/plain')])
    yield "How do you like the teaches of peaches?\n"
    yield "from " + environ['hello']
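
The context check inside the decorator can be pulled out into its own function, which makes it easy to try against fake environments. This is only a sketch in Python 3: detect_mode is a hypothetical name, not part of the script above, and it checks GATEWAY_INTERFACE before TERM (the reverse of the decorator's order) on the assumption that a web server could pass TERM through to a CGI process.

```python
def detect_mode(environ):
    """Guess how the script was launched from a dict of environment variables.

    Hypothetical helper: same three outcomes as the decorator above,
    but with the CGI check first (an assumption, see the lead-in).
    """
    if "CGI" in environ.get("GATEWAY_INTERFACE", ""):
        return "cgi"
    if "TERM" in environ:
        # A terminal is attached, so assume someone ran the script by hand.
        return "standalone"
    # Neither marker is present: assume a WSGI container imported the module.
    return "wsgi"
```

Calling detect_mode(os.environ) at import time would give the same three-way branch as the decorator.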




ORIGINAL VERSION:

If you write a script with the application entry point that fits the wsgi spec, it's simple to make it run via mod_wsgi, cgi, or via standalone server depending on the context. I believe this is common knowledge, but for my own reference here's an example with the extra setup (which will work for any script) to do this:


#!/usr/bin/python

def application(environ, start_response):
    start_response("200 OK", [('Content-Type', 'text/plain')])
    yield environ['QUERY_STRING']

if __name__ == "__main__":
    try:
        from wsgiref.simple_server import make_server
        import sys
        port = int(sys.argv[1])
        print "server on port: %i" % port
        make_server('', port, application).serve_forever()
    except Exception, e:
        import wsgiref.handlers
        wsgiref.handlers.CGIHandler().run(application)

To switch between wsgi and cgi, toggle between these Apache directives:
AddHandler cgi-script .py
AddHandler wsgi-script .py

For the stand alone server, just run it with an argument indicating the port:
python app.py 3000

To use the CherryPy server instead of wsgiref, replace the make_server() line with:

from cherrypy import wsgiserver
server = wsgiserver.CherryPyWSGIServer(('0.0.0.0', port), [('/', application)], server_name='')
try:
    server.start()
except KeyboardInterrupt:
    server.stop()

That also handles the server shutdown more politely.
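
For comparison, the same kind of polite shutdown can be had with wsgiref alone. The following is a rough Python 3 sketch rather than a drop-in replacement for the script above: the run() function is a hypothetical name, the response body is encoded because WSGI on Python 3 expects bytes, and the port check replaces the original catch-all except.

```python
from wsgiref.handlers import CGIHandler
from wsgiref.simple_server import make_server


def application(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    # WSGI on Python 3 wants bytes, so encode the query string.
    yield environ.get("QUERY_STRING", "").encode("utf-8")


def run(argv):
    """Serve standalone if a port argument is given, otherwise run as CGI."""
    if len(argv) > 1:
        port = int(argv[1])
        httpd = make_server("", port, application)
        print("serving on port: %i" % port)
        try:
            httpd.serve_forever()
        except KeyboardInterrupt:
            pass  # Ctrl-C is the normal way to stop; no traceback needed
        finally:
            httpd.server_close()  # release the listening socket
    else:
        CGIHandler().run(application)
```

To use it as a script, call run(sys.argv) under the usual if __name__ == "__main__": guard.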

Monday, April 07, 2008

Genedex: query genomic features and sequence

Normally, I don't write libraries; I figure people smarter than I am should do such things, and I should just use them. But I got tired enough of writing one-off scripts for genomic feature manipulation (find the upstream and downstream neighbors and get the sequence), and I saw enough of the pieces coming together, that I decided to build it. I'd complained before about how rtree didn't support 1D indices. Not only is this not a problem, it's beneficial. Genomic features should have strand information, so that's the 2nd dimension. Then rtree does containment queries, so it's simple to find only the features on a given strand. I realized this about the same time that the docstring for numpy's memmap went from 0 lines to about 100, and it was enhanced to take a filehandle, not just a filename. This means you can send in a start position and a shape to the numpy.memmap constructor and it can create a numpy array of only that chunk. So it's possible to slice an unaltered fasta file using the numpy array syntax. That's very good.
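
The memmap trick can be sketched like this. It is not Genedex's actual code: write_demo_fasta, fetch, and demo are hypothetical names, the demo file has a single record with its sequence on one line (so no newlines fall inside a slice), and a filename is passed rather than a filehandle.

```python
import os
import tempfile

import numpy as np


def write_demo_fasta(path):
    """Write a tiny single-record fasta file; return the byte offset of the sequence."""
    header = b">chr1\n"
    with open(path, "wb") as fh:
        fh.write(header)
        fh.write(b"ACGTACGTTTGACCA\n")
    return len(header)


def fetch(path, seq_offset, start, stop):
    """Map only bases [start, stop) of the sequence and return them as a str."""
    chunk = np.memmap(path, dtype="S1", mode="r",
                      offset=seq_offset + start, shape=(stop - start,))
    return chunk.tobytes().decode("ascii")


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.fasta")
        seq_offset = write_demo_fasta(path)
        return fetch(path, seq_offset, 4, 10)
```

Only the requested window is mapped, so the cost of a fetch does not depend on the size of the file.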

So, if you put those 2 simple things together, you have the start of something powerful. That's what I did. Then I gave it a crappy name: Genedex (Gendex was taken) and slapped it into googlecode. Check it out: http://code.google.com/p/genedex/. My only design goal was to keep it as simple as possible. If the number of features is underwhelming, that's good.
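
To make the strand-as-second-dimension idea concrete, here is a pure-Python stand-in for the containment query. It does not use rtree and is not how Genedex is implemented: Feature, bbox, contained, and features_within are hypothetical names, and the linear scan is only there to show the geometry.

```python
from collections import namedtuple

# strand is +1 or -1; start/stop are genomic coordinates
Feature = namedtuple("Feature", "name start stop strand")


def bbox(feature):
    """Box as (minx, miny, maxx, maxy): x is position, y is strand."""
    return (feature.start, feature.strand, feature.stop, feature.strand)


def contained(box, query):
    """True if box lies entirely inside the query box."""
    return (query[0] <= box[0] and query[1] <= box[1]
            and box[2] <= query[2] and box[3] <= query[3])


def features_within(features, start, stop, strand=None):
    """Names of features fully inside [start, stop], optionally on one strand."""
    # A query box covering y from -1 to 1 spans both strands;
    # a box that is flat in y selects a single strand.
    ymin, ymax = (-1, 1) if strand is None else (strand, strand)
    query = (start, ymin, stop, ymax)
    return [f.name for f in features if contained(bbox(f), query)]
```

With a real spatial index the scan would be replaced by the index's own containment query, but the boxes would be built the same way.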

TDD


Also, I generally do TDD very half-assed, with asserts and maybe a couple of doctests. However, I recently made fairly substantial changes to the SQLite datasource in featureserver, and wrote this set of doctests while doing so. It works! And I've been using it. So, I did what the featureserver devs (presumably crschmidt) did and copied the setup used for the shapely doctests. It's pretty useful for design: I'd just write out the code for how I wanted the API to look and then implement it. The only thing is, the way doctests are used (at least by me) is to copy the output from executing the code into the doctest. So, if your code is wrong to start with, you just copy the wrong answer into the doctest and it's broken but the tests pass. But at least it's good for regressions, and I just had to remember not to blindly trust the output. That's true for all testing, but especially so for doctests.
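
As a small illustration of that workflow, the docstring below was written first to show how the call should look, and the body was filled in afterwards. Both between and run_doctests are hypothetical names made up for this sketch; neither comes from Genedex or featureserver.

```python
import doctest


def between(a, b):
    """Return the (start, stop) gap between two non-overlapping features.

    Each feature is a (start, stop) tuple; argument order doesn't matter.

    >>> between((100, 200), (300, 400))
    (200, 300)
    >>> between((300, 400), (100, 200))
    (200, 300)
    """
    first, second = sorted([a, b])
    return (first[1], second[0])


def run_doctests(func):
    """Run the examples in func's docstring; return (failed, attempted)."""
    test = doctest.DocTestParser().get_doctest(
        func.__doc__, {func.__name__: func}, func.__name__, None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    return runner.run(test)
```

The expected values here were worked out by hand before the function existed, which is the part that guards against pasting a wrong answer into the test.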

So, there are now more tests than code. But, since it's mostly just tying together pieces that do the real work, it's not much code. Doctests are also nice because (as the name suggests) they double as documentation. So, here's the genedex documentation:
http://genedex.googlecode.com/svn/trunk/doc/readme.html
It's pretty! It gets colored by pygments, using this script. The only major thing I'd like to add to the library is a plotting class using matplotlib. Then other smaller tasks like a method that takes 2 features and returns the sequence between them.
Any fixes, enhancements, ridicule, etc. will be greeted with commit access.

Sunday, April 06, 2008

comparative genomics with openlayers

Traditional genome browsers look like this. In fact, I think that's the most popular genome browser in use: gbrowse. They display information in tracks, so any layer of annotation just gets added on to the bottom of the image (after making the image taller). This doesn't work for genome-browser, the hack of OpenLayers to support only horizontal scrolling, because if you have 2 adjacent tiles and one has more features than the next, there's no guarantee that they'll be the same height, and no guarantee that a feature that's on both images will align correctly.

I was just hacking around, trying to test some work I'd done, and realized that you can have annotation layers with OpenLayers: just add another map, and tie them together!


So that's 2 OpenLayers.Map() instances. What makes this easy is the new Map.panTo() method in OpenLayers 2.6 (which is in release candidate 1). So, the top map registers for 'move' and 'zoomend' events with callbacks that update the bottom map with the position/zoom of the top map.
That's it! And layers of annotation are available, along with the slippy map. OpenLayers continues to amaze.
That site with the linked maps is here.