Friday, October 24, 2008

appengine memcache memoize decorator

[NOTE: see the 2nd comment below about using a tuple as a key. It's better to just use pickle.dumps for the key.]
I've been playing with google appengine lately. I'm working on a fun, pointless side project. Here's what I came up with for a cache decorator that pulls from memcache based on the args, kwargs and function name if no explicit key is given. The code for creating a key from those is from the recipe linked in the docstring.
"""
a decorator to use memcache on google appengine.
optional arguments:
`key`: the key to use for the memcache store
`time`: the time to expiry sent to memcache

if no key is given, the function name, args, and kwargs are
used to create a unique key so that the same function can return
different results when called with different arguments (as
expected).

usage:
NOTE: actual usage is simpler as:
    @gaecache()
    def some_function():
        ...

but doctest doesn't seem to like that.

>>> import time

>>> def slow_fn():
...     time.sleep(1.1)
...     return 2 * 2
>>> slow_fn = gaecache()(slow_fn)

this run takes over a second.
>>> t = time.time()
>>> slow_fn(), time.time() - t > 1
(4, True)

this one grabs from cache in under .01 seconds
>>> t = time.time()
>>> slow_fn(), time.time() - t < .01
(4, True)

modified from
http://code.activestate.com/recipes/466320/
and
http://code.activestate.com/recipes/325905/
"""

from google.appengine.api import memcache
import logging
import pickle

class gaecache(object):
    """
    memoize decorator to use memcache with a timeout and an optional key.
    if no key is given, the func_name, args, kwargs are used to create a key.
    """
    def __init__(self, time=3600, key=None):
        self.time = time
        self.key = key

    def __call__(self, f):
        def func(*args, **kwargs):
            if self.key is None:
                t = (f.func_name, args, kwargs.items())
                try:
                    hash(t)
                    key = t
                except TypeError:
                    try:
                        key = pickle.dumps(t)
                    except pickle.PicklingError:
                        logging.warn("cache FAIL:%s, %s", args, kwargs)
                        return f(*args, **kwargs)
            else:
                key = self.key

            data = memcache.get(key)
            if data is not None:
                logging.info("cache HIT: key:%s, args:%s, kwargs:%s", key, args, kwargs)
                return data

            logging.warn("cache MISS: key:%s, args:%s, kwargs:%s", key, args, kwargs)
            data = f(*args, **kwargs)
            memcache.set(key, data, self.time)
            return data

        func.func_name = f.func_name
        return func
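
Following up on the note at the top: here is a rough sketch of how the key could be built with pickle.dumps alone, skipping the hash(t) check entirely. The helper name `make_cache_key` and the "gaecache:" prefix are made up for illustration and are not part of the decorator above. It also hashes the pickled bytes with sha1 so the key stays a short string, which is an extra assumption on my part rather than something memcache requires in every case.

```python
import hashlib
import pickle


def make_cache_key(func_name, args, kwargs):
    """Build a short string key from a function name and its arguments.

    Hypothetical helper, not part of the original decorator.
    """
    # Sort the kwargs so f(a=1, b=2) and f(b=2, a=1) end up with the same key.
    payload = (func_name, args, sorted(kwargs.items()))
    # Pin the pickle protocol so the bytes do not depend on the interpreter default.
    pickled = pickle.dumps(payload, 2)
    # Hash the pickled bytes to get a fixed-length string instead of a tuple key.
    return "gaecache:" + hashlib.sha1(pickled).hexdigest()


key_one = make_cache_key("slow_fn", (1, 2), {"a": 1, "b": 2})
key_two = make_cache_key("slow_fn", (1, 2), {"b": 2, "a": 1})
key_three = make_cache_key("slow_fn", (1, 3), {"a": 1, "b": 2})
```

Inside `func`, `key = make_cache_key(f.func_name, args, kwargs)` would replace the try/except around hash(t), with pickle.PicklingError still caught for arguments that can't be pickled. One caveat: pickle output is only as stable as the arguments themselves, so objects whose pickled form changes between calls would never hit the cache.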

3 comments:

Divided Mind said...

Thanks! I was this short of writing something like this myself! Google should supply it in the std library.

thesweeheng said...

It seems that hash(t) will always raise a TypeError since kwargs.items() returns an un-hashable list. It seems more straightforward to just use pickle.dumps()

Also, AppEngine's memcache only uses the 2nd element if you pass in a tuple as a key. You can test the following memcache code at http://shell.appspot.com:

>>> from google.appengine.api import memcache as m
>>> k=("a","b",1,2,3)
>>> m.set(k, 3.14159)
True
>>> h=("c","b",9,9)
>>> m.get(h)
3.1415899999999999
>>> m.get(k)
3.1415899999999999

Gowtham said...

Thanks for the decorator code.

As thesweeheng mentioned, "hash(t)" always fails because kwargs.items() returns a list. However, if you cast the list to a tuple, hash will work perfectly fine.

t = (f.func_name, args, tuple(kwargs.items()))
hash(t)