Posts

Showing posts with the label gis

some python ctypes stuff in Rtree

I've been working with and on the Rtree python module. It's some cool work done by Howard Butler (hobu) (originally with Sean Gillies ) to make python bindings for the Spatial Index C++ library by Marios Hadjieleftheriou which provides various tree structures for efficient spatial searches in n-dimensions. Hobu has written a C API for that along with a new ctypes wrapper to that API which appears in Rtree 0.5 and greater. There is some cool ctypes stuff in there which I'm starting to understand. From the website: ctypes is a foreign function library for Python. It provides C compatible data types, and allows calling functions in DLLs or shared libraries. It can be used to wrap these libraries in pure Python. as a simple example of how ctypes works we can pretend there's no atan() in python's math module and access the one from libm in c like this: import ctypes libm = ctypes.CDLL('libm.so.6') # the following 2 lines correspond to the c signature: double...

landsummary.com

One of the things I like the least about my real job, and much of the contract work that I do is that i'm usually the only programmer working on each task. So, it's been very fun to work on a project with Josh Livni ( His writeup ). We got together one afternoon, and by the time we left, we had a reasonable start of what we call landsummary , we've since put in a fair bit of work sprinkled here and there. Josh set up an AWS server--it's nice for me to have fewer sys-admin duties too. What's actually on display is fairly modest. What it does is takes a user-drawn square, circle, or arbitrary polygon, and uses that to summarize the NLCD dataset along with some census data. The things that make it more than lame are that it's very fast, it can easily be extended to summarize any raster dataset, and we have a sorta cool API (not documented) which allows us or anyone to query the data with a WKT Polygon and request a particular service -- currently nlcd, population,...

choosing django

I prefer sqlalchemy and genshi (or mako) and was therefore looking at using turbogears, but I saw a demo of the django admin, and that sold me. Certainly the templating language did not. Before this, I'd only used web.py in my projects. These are the things I've liked/noted: The first and most important: community . (Oddly enough, as I write this there are 666 projects tagged as django. 'turbogears', 'tg2', 'tg' give less than 50 projects combined. Think someone might have already written what you need? yep. Also, a great site: http://www.djangosnippets.org/ , where I've learned a lot just by reading, and saved myself a lot of time, by extending ideas there. And the development is active . Second, django.contrib.* User authentication is simple, and check google-code for various alterations on the theme. admin. This was what first made me decide on django. And now, new-forms admin is in trunk. This gives you a pretty nice CRUD interface for models ...

wherecamp

I agree with every report I've seen. Wherecamp was awesome. I've been telling people, and I'm still sure it's true, that I met exactly zero people who I'd physically seen before. Ordinarily, I avoid meetings, but this is a good format and seems to attract good people. It's fun to meet and work with people who are really into what they do. The talks are less "talky" and more like chat sessions--which is possible when the groups are small. There was also plenty of time to hack, which was the original reason I went. During and after, I learned some simple things which I'm trying to incorporate into my usual workflow: In the shell, background a job with "ctrl + z" then get back to it with %i where i is the number shown in the output from "jobs". That's a trick from jlivni. From crschmidt , I added: alias doctest="nosetests --with-doctest --doctest-extension=.txt" to my .bash_aliases. Which let's me do: doctest tes...

openlayers, genomes and image-maps

In response to Titus' post on using imagemaps for genomic visualization: Why are imagemaps so popular in genomics? As an extreme and unfair comparison, just imagine if http://maps.google.com was an image map. Given a CGI script that can accept a url like &start=1024&stop=2048&chr=3 and return an appropriate image, you can provide a substantial set of tools using openlayers , which is developed by what must be one of the largest and active developer communities in GIS. (Yes, I am an openlayers fan-boy.) You can do that with a small addition to openlayers which I updated a couple weeks ago to OL version 2.6. In that update, I removed > 140 lines of code . So, it's now even less of a change to OL. Maybe when 2.7 comes out, I'll figure out how to provide a patch that allows an extra argument to the OpenLayers.Map constructor that limits panning to the horizontal direction -- in which case genome-browser will cease to exist and only the single file containing ...

numpy to tiff via gdal

EDIT: 7 months later I came back to this and found an error. update in the code below with old line commented out. Rather than venting about a project I've recently decoupled myself from, I'll try to do something constructive... I also posted this to the gispython mailing list, but I've had to figure it out a couple times, so I'll put it here for the record. Given an N * N numpy array, and a bounding box, it's actually fairly simple to make a georeferenced tiff: from osgeo import gdal, gdal_array import numpy from osgeo.gdalconst import GDT_Float64 xsize, ysize = 10, 10 a = numpy.random.random((xsize, ysize)).astype(numpy.float64) xmin, xmax = -121., -119. ymin, ymax = 41., 43. driver = gdal.GetDriverByName('GTiff') # bad: out = driver.Create('a.tiff', a.shape[0], a.shape[1], 1, GDT_Float64) # the args to Create are 'name', xsize, ysize. and .shape[0] is rows, which is y. driver.Create('a.tiff', a.shape[1], a.shape[0], 1, GDT_Float64...

comparative genomics with openlayers

Image
Traditional genome browsers, look like this . In fact, I think that's the most popular genome-browser used--gbrowse. They display information in tracks, so any layer of annotation you just add on to the bottom of the image (after making the image taller). This doesnt work for genome-browser, the hack of openlayers to support only horizontal scrolling, because you if you have 2 adjacent tiles, if one has more features than the next, there's not guarantee that they'll be the same height, and no guarantee that a feature that's on both images will align correctly. I was just hacking around, trying to test some work I'd done and realized that you can have annotation layers with OpenLayers, just add another map, and tie them together! So that's 2 OpenLayers.Map() instances. What makes this easy is the new Map.panTo() methods in OpenLayers 2.6 (which is in release candidate 1). So, the top map registers for 'move' and 'zoomend' events with callbacks th...

featureserver authentication

As the name implies, featureserver serves vector features to various formats from a number of datasources, including OGR -- which means pretty much any vector format. That's extremely powerful. Really. That means, for instance, that when you're working on a really cool project and all anyone wants to know is if they can see it in KML/Google Earth, it's no extra work. Just point them to the REST-ful url like "http://example.com/featureserver/all.kml", and continue working on the cool project. Likewise for all.gml, .atom, etc. And, if you have a project with spatial data, if you put it in a format that featureserver understands, it's displayable, and editable in openlayers. The next thing people want in a web application is some sort of user restrictions. In featureserver, by default, anyone can do any of the CRUD operations on any feature. I've been playing with a soon-to-be-open-sourced PPGIS (apparently the trendy acronym for that is now VGI ) projec...

point partitioning

Image
After you spend all day banging your head against your own problems, sometimes it's just nice to bang it on something else for a bit. This question came through on the postgis mailing list and it seemed like a good diversion. I think it's a very clear description of the problem. To quote: I have around 300,000 points, each with a lat, lon and altitude (also converted to geometry). I need to get a subset of those points, where none of them are within 5m (or some other arbitrary distance) of each other. It doesnt matter which points get picked over another, as long as whatever the data set it creates, that none of the points are within that 5m radius and that relatively most of the points are used So, I hacked up a quick solution. It's probably inefficient -- deleting keys from a dict, and removing entries from an rtree index. But, it's easy to understand, and (without the plotting) it runs in about 2 minutes for the requested 300000 points. When plotting, the image lo...

rtree: know your nearest neigbhors

My computer spends a lot of time looking for neighbors of a given location-- even more so for bio, than for geo. This is what I've learned about the options for doing smarter search so far. SELECT * from ( (SELECT * FROM ((SELECT * FROM feature WHERE start = ? ORDER BY start LIMIT 1)) as u) UNION (SELECT * FROM ((SELECT * FROM feature where stop = ? ORDER BY stop LIMIT 1)) as v) ) as w ORDER BY ABS((start + stop)/2 - ?) LIMIT 1 if you fill in ? with an integer location, that query will return the closest feature most of the time. It's verbose if not ugly, and that's only for 1 dimension. It can return the wrong feature in certain cases.... You have to write it like that in MySQL, because it doesnt support functional indexes , so as soon as you do something like: ORDER BY ABS((start + stop)/2 - ?) it's no longer an indexed search It's a hard problem, even if you're using postgis . And even if you're a postGIS badass . Other than postGIS, there postgres...

What a Shapely genome you have!

Image
This might be a case of if you have a really cool hammer , everything looks like a nail, but it was fun mixing tools from different disciplines. After finding synteny , there's a bunch of paired genes whose neighbors are also pairs. Paired ( homologous ) genes have similar sequence because they have some function and can't change without loss of function. Non-gene sequence between the paired genes is mostly randomized via mutation, deletion, etc. But, there is non-gene sequence that is conserved between the genes. These CNS's-- conserved non-coding sequences--are usually sites that bind stuff that regulates the expression of a gene. That looks like this. With one gene on the top, and its pair below, both yellow. Pink lines in the foreground connect putative CNSs (similar sequences) between these genes. That the lines cross is bad. CNSs occur right at the level of noise. So even though a similar sequence occurs near both genes, it could be by chance. It is possible to red...

flash, vi, fcsh

All my flash tinkering has been in VIM-- no IDE, no XML, just actionscript. It's a little tough to deal with the adobe compiler as it takes about 11 seconds to compile a large project like modestmaps on my machine. That's not good for a guess-and-check programmer. The typing does catch some errors. The worldkit project compiles instantaneously with mtasc (the predecessor to haxe )--likewise for the as2 branch of modestmaps. The flash compiler shell drops the compile time for as3 modestmaps to under 3 seconds, so I've added this to my .bash_aliases: alias fcsh="/usr/bin/rlwrap /opt/src/flex2/bin/fcsh" the rlwrap is to use readline in the flash shell--meaning I can just press up-arrow to get the previous compile command. By default, one has to paste or type the entire command again. With that, it's close to a reasonable workflow.

open source gis and flash maps part two

Mash Flap I started looking into flash mapping stuff lately. For the patch I submitted to worldkit, Mikel gave me commit access! So, now the svn version of worldkit can be compiled with mtasc by typing "make". I feel unreasonably proud of that, given that much of what I did was some global search and replace stuff in VI, and then read and fixed mtasc compiler errors until they went away. It was good fun. Michal Migurski saw in that same post that I mentioned modestmaps and gave me some good ideas on getting WMS going. I just figured out how to get that working and posted a message to their overly web2.0 forums. Hopefully someone with some real actionscript skillz will clean it up. A mapping library without a good WMS interface is much less useful for most of the stuff I do. I haven't decided whether to use modestmaps or worldkit, or both. The time stuff the Mikel has done in worldkit is very cool and I haven't really looked at that yet. But I have a tim...