Velocity Reviews > Function decorator that caches function results

Function decorator that caches function results

=?ISO-8859-1?Q?Lasse_V=E5gs=E6ther_Karlsen?=
Guest
Posts: n/a

 10-08-2005
After working through a fair number of the challenges at
www.mathschallenge.net, I noticed that some long-running functions can
be helped *a lot* by caching their function results and retrieving from
cache instead of calculating again. This means that often I can use a
natural recursive implementation instead of unwinding the recursive
calls to avoid big exponential running-times.

Ok, so I thought, how about creating a decorator that caches the
function results and retrieves them from cache if possible, otherwise it
calls the function and store the value in the cache for the next invokation.

This is what I came up with so far:

def cache_function(fn):
cache = {}
def cached_result(*args, **kwargs):
if args in cache:
return cache[args]
result = fn(*args, **kwargs)
cache[args] = result
return result
return cached_result

Example of usage:

@cache_function
def fibonacci(idx):
if idx in (1, 2):
return 1
else:
return fibonacci(idx-1) + fibonacci(idx-2)

for index in range(1, 101):
print index, fibonacci(index)

this example goes from taking exponential time to run to being near
instantaneous.

However, this:

for index in range(1, 101):
print index, fibonacci(idx = index)

this one uses the kwargs list of arguments, and I'm not sure how to
change my function to take that into account.

Does anyone have any clues as to how to do that?

The solution would have to consider fibonacci(50) and fibonacci(idx =
50) as the same call and thus retrieve the second one from the cache.

--
Lasse Vågsæther Karlsen
http://usinglvkblog.blogspot.com/
(E-Mail Removed)
PGP KeyID: 0x2A42A1C2

Duncan Booth
Guest
Posts: n/a

 10-08-2005
Lasse Vågsæther Karlsen wrote:
> However, this:
>
> for index in range(1, 101):
> print index, fibonacci(idx = index)
>
> this one uses the kwargs list of arguments, and I'm not sure how to
> change my function to take that into account.
>
> Does anyone have any clues as to how to do that?
>
> The solution would have to consider fibonacci(50) and fibonacci(idx =
> 50) as the same call and thus retrieve the second one from the cache.
>

Most of your gain comes from caching the recursive calls, not from caching
the original call.

If the solution doesn't consider keyword arguments equivalent to positional
arguments, it will make *one* call to the real function, and then the
recursive call will still come out of the cache, so the net result will
still be to give you almost all the speedup you wanted.

Unless the results stored in the cache are very large data structures, I
would suggest that you simply store (args,kwargs) as the cache key and
accept the hit that sometime you'll cache the same call multiple times.

jepler@unpythonic.net
Guest
Posts: n/a

 10-08-2005
On Sat, Oct 08, 2005 at 01:53:28PM +0000, Duncan Booth wrote:
> Unless the results stored in the cache are very large data structures, I
> would suggest that you simply store (args,kwargs) as the cache key and
> accept the hit that sometime you'll cache the same call multiple times.

... except that dicts cannot be dict keys

Another 'memoize' decorator uses this to get the key:
kw = kwargs.items()
kw.sort()
key = (args, tuple(kw))
http://aspn.activestate.com/ASPN/Coo.../Recipe/325905

Jeff

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)

iD8DBQFDR9vXJd01MZaTXX0RAq5+AJ4yVo8MbputurPvObHBdt urrhZ70gCfUpN2
HxoUEMfpxhkFW3FUDVR+Qkg=
=4UHm
-----END PGP SIGNATURE-----

Sam Pointon
Guest
Posts: n/a

 10-08-2005
What about not storing args at all? Something like this:

def cache_function(func, args_list):
cache = {}
def cached_result(*args, **kwargs):
kwargs.update(dict(zip(args_list, args)))
if kwargs in cache:
return cache[kwargs]
result = func(**kwargs)
cache[kwargs] = result
return result
return cached_result

args_list is a list of all the argument names, so that they can be
converted into keyword arguments.

=?ISO-8859-1?Q?Lasse_V=E5gs=E6ther_Karlsen?=
Guest
Posts: n/a

 10-08-2005
http://www.velocityreviews.com/forums/(E-Mail Removed) wrote:
> On Sat, Oct 08, 2005 at 01:53:28PM +0000, Duncan Booth wrote:
>
>>Unless the results stored in the cache are very large data structures, I
>>would suggest that you simply store (args,kwargs) as the cache key and
>>accept the hit that sometime you'll cache the same call multiple times.

>
>
> ... except that dicts cannot be dict keys
>
> Another 'memoize' decorator uses this to get the key:
> kw = kwargs.items()
> kw.sort()
> key = (args, tuple(kw))
> http://aspn.activestate.com/ASPN/Coo.../Recipe/325905
>
> Jeff

Yeah, but as far as I can see it, this one too fails to recognize
situations where the function is called twice with essentially the same
values, except that in one call it uses named arguments:

k1 = fibonacci(100)
k2 = fibonacci(idx = 100)

this is essentially the same call, except the second one uses a named
argument, which means the function will be invoked and a second cache
entry will be stored.

Granted, not a big problem in most such cases, but here's my augmented
function. Bare in mind that I'm 2 weeks into Python so there's bound to
be room for improvement

def cache(fn):
cache = {}
arg_names = inspect.getargspec(fn)[0]
def cached_result(*args, **kwargs):
# If function is called without parameters, call it without
using the cache
if len(args) == 0 and len(kwargs) == 0:
return fn()

# Work out all parameter names and values
values = {}
for i in range(len(args)):
values[arg_names[i]] = args[i]
for key in kwargs:
values[key] = kwargs[key]
key = tuple([(key, value) for (key, value) in
sorted(values.iteritems())])

# Check cache and return cached value if possible
if key in cache:
return cache[key]

# Work out new value, cache it and return it
result = fn(*args, **kwargs)
cache[key] = result
return result

# Return wrapper function
return cached_result

--
Lasse Vågsæther Karlsen
http://usinglvkblog.blogspot.com/
(E-Mail Removed)
PGP KeyID: 0x2A42A1C2

=?ISO-8859-1?Q?Lasse_V=E5gs=E6ther_Karlsen?=
Guest
Posts: n/a

 10-08-2005
Sam Pointon wrote:
> What about not storing args at all? Something like this:
>
> def cache_function(func, args_list):
> cache = {}
> def cached_result(*args, **kwargs):
> kwargs.update(dict(zip(args_list, args)))
> if kwargs in cache:
> return cache[kwargs]
> result = func(**kwargs)
> cache[kwargs] = result
> return result
> return cached_result
>
> args_list is a list of all the argument names, so that they can be
> converted into keyword arguments.
>

I'll take a look at the zip function, didn't know about that one, but
your example also has the problem that dictionaries can't be used as
dictionary keys, but that can be easily solved. I think I can simplify
my solution with some of yours.

The parameter names can be gotten by doing a inspect.getargspec(fn)[0]
so that can be done by the decorator function.

--
Lasse Vågsæther Karlsen
http://usinglvkblog.blogspot.com/
(E-Mail Removed)
PGP KeyID: 0x2A42A1C2

Fredrik Lundh
Guest
Posts: n/a

 10-08-2005
Lasse Vågsæther Karlsen wrote:

> Yeah, but as far as I can see it, this one too fails to recognize
> situations where the function is called twice with essentially the same
> values, except that in one call it uses named arguments:
>
> k1 = fibonacci(100)
> k2 = fibonacci(idx = 100)
>
> this is essentially the same call, except the second one uses a named
> argument, which means the function will be invoked and a second cache
> entry will be stored.

whoever writes code like that deserves to be punished.

I'd say this thread points to a misunderstanding of what keyword arguments
are, and how they should be used. the basic rule is that you shouldn't mix and
match; use positional arguments for things that are documented to be positional
parameters, and keyword arguments for keyword parameters. if you don't,
you'll end up with code that depends on implementation details. and that's
never a good idea...

</F>

=?ISO-8859-1?Q?Lasse_V=E5gs=E6ther_Karlsen?=
Guest
Posts: n/a

 10-08-2005
Lasse Vågsæther Karlsen wrote:
> Sam Pointon wrote:
>
>> What about not storing args at all? Something like this:

<snip>

Ok, here's my updated version:

class cache(object):
def __init__(self, timeout=0):
self.timeout = timeout
self.cache = {}

def __call__(self, fn):
arg_names = inspect.getargspec(fn)[0]
def cached_result(*args, **kwargs):
# Update named arguments with positional argument values
kwargs.update(dict(zip(arg_names, args)))

# Work out key as a tuple of ('argname', value) pairs
key = tuple(sorted(kwargs.items()))

# Check cache and return cached value if possible
if key in self.cache:
(value, last_time) = self.cache[key]
if self.timeout <= 0 or time.time() - last_time <=
self.timeout:
return value

# Work out new value, cache it and return it
result = fn(**kwargs)

self.cache[key] = (result, time.time())
return result

# Return wrapper function
return cached_result

Changed from previous versions:
- converted to class, must use () on decorator now
- added timeout, results will be recalculated when it expires
- timeout=0 means no timeout, results will always be reused
- now handles both positional and keyword arguments

--
Lasse Vågsæther Karlsen
http://usinglvkblog.blogspot.com/
(E-Mail Removed)
PGP KeyID: 0x2A42A1C2

=?ISO-8859-1?Q?Lasse_V=E5gs=E6ther_Karlsen?=
Guest
Posts: n/a

 10-08-2005
Fredrik Lundh wrote:
<snip>
>>k1 = fibonacci(100)
>>k2 = fibonacci(idx = 100)

<snip>
> whoever writes code like that deserves to be punished.
>
> I'd say this thread points to a misunderstanding of what keyword arguments
> are, and how they should be used. the basic rule is that you shouldn't mix and
> match; use positional arguments for things that are documented to be positional

<snip>

I agree completely.
However, it turns out I have to communicate code with and from people
that have a coding standard that dictates using keyword arguments for
all interfaces. Those functions will also benefit from the cache system
as many of them involves database lookups.

In any case, your response gave me another thing that my solution won't
handle so I'm going to just leave it as it is and look at it if I ever
need it for positional arguments, which I honestly don't believe I will.

--
Lasse Vågsæther Karlsen
http://usinglvkblog.blogspot.com/
(E-Mail Removed)
PGP KeyID: 0x2A42A1C2

George Sakkis
Guest
Posts: n/a

 10-08-2005
"Lasse Vågsæther Karlsen" <(E-Mail Removed)> wrote:

> [snip]
>
> Ok, so I thought, how about creating a decorator that caches the
> function results and retrieves them from cache if possible, otherwise it
> calls the function and store the value in the cache for the next invokation.
>
> [snip]

Cool, you re-invented the memoization pattern:
http://en.wikipedia.org/wiki/Memoization
http://aspn.activestate.com/ASPN/sea...ype=Subsection

Yes, it's kinda discouraging that most interesting ideas have already been conceived, implemented
and used by others...<wink>

George