Velocity Reviews


Ulrich Eckhardt 02-05-2013 03:18 PM

puzzled by name binding in local function
 
Hello Pythonistas!

Below you will find example code distilled from a set of unit tests,
usable with Python 2 or 3. I'm using a loop over a list of parameters to
generate tests with different permutations of parameters. Instead of
calling util() with values 0-4 as I would expect, each call uses the
same parameter 4. What I found out is that the name 'i' is not substituted
inside the for-loop but resolved when Foo.test_1 is called, which then
finds the global 'i' left over from the loop. A simple "del i" after the
loop confirmed this and gave me a corresponding error.

Now, I'm still not sure how to best solve this problem:
* Spelling out all permutations is a no-go.
* Testing the different iterations inside a single test is
inconvenient because I want to know exactly which permutation fails and
which others don't. Further, I want to be able to run just that one
because the tests take time.
* Further, I could generate local test() functions using the current
value of 'i' as default for a parameter, which is then used in the call
to self.util(), but that code is just as non-obviously-to-me correct as
the current code is non-obviously-to-me wrong. I'd prefer something more
stable.


Any other suggestions?

Thank you!

Uli


# example code
from __future__ import print_function
import unittest

class Foo(unittest.TestCase):
    def util(self, param):
        print('util({}, {})'.format(self, param))

for i in range(5):
    def test(self):
        self.util(param=i)
    setattr(Foo, 'test_{}'.format(i), test)

unittest.main()

Dave Angel 02-05-2013 06:07 PM

Re: puzzled by name binding in local function
 
On 02/05/2013 10:18 AM, Ulrich Eckhardt wrote:
> Hello Pythonistas!
>
> Below you will find example code distilled from a set of unit tests,
> usable with Python 2 or 3. I'm using a loop over a list of parameters to
> generate tests with different permutations of parameters. Instead of
> calling util() with values 0-4 as I would expect, each call uses the
> same parameter 4. What I found out is that the name 'i' is resolved when
> Foo.test_1 is called and not substituted inside the for-loop, which
> finds the global 'i' left over from the loop. A simple "del i" after the
> loop proved this and gave me an according error.
>
> Now, I'm still not sure how to best solve this problem:
> * Spell out all permutations is a no-go.
> * Testing the different iterations inside a single test, is
> inconvenient because I want to know which permutation exactly fails and
> which others don't. Further, I want to be able to run just that one
> because the tests take time.
> * Further, I could generate local test() functions using the current
> value of 'i' as default for a parameter, which is then used in the call
> to self.util(), but that code is just as non-obviously-to-me correct as
> the current code is non-obviously-to-me wrong. I'd prefer something more
> stable.
>
>
> Any other suggestions?
>
> Thank you!
>
> Uli
>
>
> # example code
> from __future__ import print_function
> import unittest
>
> class Foo(unittest.TestCase):
>     def util(self, param):
>         print('util({}, {})'.format(self, param))
>
> for i in range(5):
>     def test(self):
>         self.util(param=i)
>     setattr(Foo, 'test_{}'.format(i), test)
>
> unittest.main()


There is only one instance of i, so it's not clear what you expect.
Since it's not an argument to test(), it has to be found in the closure
to the function. In this case, that's the global namespace. So each
time the function is called, it fetches that global.

To put it another way, you're storing five function objects with
identical code, all of which look up the same i when called.
If you need to have separate function objects that already know a
value for i, you need to somehow bind the value into the function object.

One way to do it, as you say, is with default parameters. A function's
default parameters are each stored in the object, because they're
defined to be evaluated only once. That's sometimes considered a flaw,
such as when they're volatile, and subsequent calls to the function use
the same value. But in your case, it's a feature, as it provides a
standard place to store values as known at function definition time.

The other way to do it is with functools.partial(). I can't readily
write you sample code, as I haven't messed with it in the case of class
methods, but partial is generally a way to bind one or more values into
the actual object. I also think it's clearer than the default parameter
approach.


Notice that globals may be defined after a function that references
them, which is a way of cross-checking the logic you already discovered.
The names are only looked up when the function is actually called.

This same logic applies to nested functions; the class definition is an
unnecessary complication; of course I understand it's needed for unittest.

The main place where I see this type of problem is in a gui, where
you're defining a callback to be used by a series of widgets, but you
have a value that IS different for each item in the series. You write a
loop much like you did, and discover that the last loop value is the
only one used. The two cures above work, and you can also use lambda
creatively.

--
DaveA

Terry Reedy 02-05-2013 07:27 PM

Re: puzzled by name binding in local function
 
Code examples are Python 3

On 2/5/2013 10:18 AM, Ulrich Eckhardt wrote:

> Below you will find example code distilled from a set of unit tests,
> usable with Python 2 or 3. I'm using a loop over a list of parameters to
> generate tests with different permutations of parameters. Instead of
> calling util() with values 0-4 as I would expect, each call uses the
> same parameter 4. What I found out is that the name 'i' is resolved when
> Foo.test_1 is called


Names* in Python code are resolved when the code is executed.
Function bodies are executed when the function is called.
Ergo, names in function bodies are resolved when the function is called.
This is sometimes called late binding.

* This may exclude keyword names.

Late binding of global names within functions is why the following can
work instead of raising NameError.

>>> def f(): print(x)


>>> x = 3
>>> f()

3

Only the most recent binding of x, at the time of the call matters, as
long as there is one. Does the following really surprise you?

>>> x = 0
>>> def f(): print(x)


>>> x = 3
>>> f()

3

What do you expect this to print?

>>> x = 1
>>> def f1(): print(x)


>>> x = 2
>>> def f2(): print(x)


>>> x = 3
>>> f1(), f2()


Rolling the repeated code into a loop does not magically change the
behavior of def statements.

for i in range(1, 3):
    exec('''\
x = {0}
def f{0}(): print(x)'''.format(i))

x = 3
print((f1(), f2()))

This gives *exactly* the same output.

So does this:

from textwrap import dedent

for i in range(1, 3):
    exec(dedent('''
        x = {0}
        def f{0}():
            print(x)
        '''.format(i)))

x = 3
print((f1(), f2()))


Python does not do text substitution unless you explicitly ask it to, as
I did above.

Late binding is also why functions (and methods, such as .__init__) can
call functions (methods) whose definitions follow later in the code, so
you would not want this to change ;-).

> and not substituted inside the for-loop,


> Now, I'm still not sure how to best solve this problem:
> * Spell out all permutations is a no-go.
> * Testing the different iterations inside a single test, is
> inconvenient because I want to know which permutation exactly fails and


A good test framework should give specifics as to the failure. The
unittest assertXxx methods do this. In fact, emitting specific messages
is one reason there are so many methods.

The real 'problem' with multiple tests within a test function is that
the first failure ends that group of tests. But this is only a problem
during development when there *are* failures. And it is possible to
write a test function to run all tests and collect multiple error
messages before 'failing' the test.

> which others don't. Further, I want to be able to run just that one
> because the tests take time.


Whether multiple tests are buried within one function or many, running
just one of them will require some editing.

> * Further, I could generate local test() functions using the current
> value of 'i' as default for a parameter, which is then used in the call
> to self.util(), but that code is just as non-obviously-to-me correct as
> the current code is non-obviously-to-me wrong.


LOL. You know the easiest and correct solution, but reject it because it
is not 'obvious' - though it was obvious enough for you to see it.

If one understands that function definitions are executable statements
and that their execution is not magically changed by putting them inside
loops, the problem with your code should be obvious. It creates 5
*identical* function objects. So it should not be surprising that they
behave identically.
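
A short sketch to make "identical function objects" concrete, assuming it is run at module level like the original example. Each pass through the loop creates a new function object, but all of them share one code object and look up the same name when called:

```python
funcs = []
for i in range(5):
    def test():
        return i  # resolved when test() is called, not when it is defined
    funcs.append(test)

distinct = len(set(id(f) for f in funcs)) == 5                   # five separate objects
same_code = all(f.__code__ is funcs[0].__code__ for f in funcs)  # one shared code object
results = [f() for f in funcs]                                   # every call sees the final i
print(distinct, same_code, results)
```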

> I'd prefer something more stable.


The fact that default arg expressions are evaluated when the function is
defined is quite stable. Ain't gonna change.

> Any other suggestions?


Revise your obvious meter ;-).

> # example code
> from __future__ import print_function
> import unittest
>
> class Foo(unittest.TestCase):
>     def util(self, param):
>         print('util({}, {})'.format(self, param))
>
> for i in range(5):


>     def test(self):
>         self.util(param=i)


Executing this n times produces n identical functions. The easy fix is

def test(self, j = i): self.util(param = j)

>     setattr(Foo, 'test_{}'.format(i), test)


Another fix that should work: adapt my code above and use exec within a
loop within the class statement itself (and delete setattr).

for i in range(5):
    exec(dedent('''
        def test_{0}(self):
            self.util(param={0})
        '''.format(i)))

> unittest.main()


--
Terry Jan Reedy


Ulrich Eckhardt 02-06-2013 10:19 AM

Re: puzzled by name binding in local function
 
Dave and Terry,

Thank you both for your explanations! I really appreciate the time you
took.

On 05.02.2013 19:07, Dave Angel wrote:
> If you need to have separate function objects that already know a
> value for i, you need to somehow bind the value into the function object.
>
> One way to do it, as you say, is with default parameters. A function's
> default parameters are each stored in the object, because they're
> defined to be evaluated only once. That's sometimes considered a flaw,
> such as when they're volatile, and subsequent calls to the function use
> the same value. But in your case, it's a feature, as it provides a
> standard place to store values as known at function definition time.


Yes, that was also the first way I found myself. The reason I consider
this non-obvious is that it creates a function with two parameters (one
with a default) while I only want one with a single parameter. This is
to some extent a bioware problem and/or a matter of taste, both for me
and for the other audience that I'm writing the code for.


> The other way to do it is with functools.partial(). I can't readily
> write you sample code, as I haven't messed with it in the case of class
> methods, but partial is generally a way to bind one or more values into
> the actual object. I also think it's clearer than the default parameter
> approach.


Partial would be clearer, since it explicitly binds the parameters:

import functools

class Foo(object):
    def function(self, param):
        print('function({}, {})'.format(self, param))
Foo.test = functools.partial(Foo.function, param=1)

f = Foo()
Foo.test(f) # works
f.test() # fails

I guess that Python sees "Foo.test" and since it is not a (nonstatic)
function, it doesn't create a bound method from this. Quoting the very
last sentence in the documentation: "Also, partial objects defined in
classes behave like static methods and do not transform into bound
methods during instance attribute look-up."

The plain-Python version mentioned in the functools documentation does
the job though, so I'll just use that with a fat comment. Also, after
some digging, I found http://bugs.python.org/issue4331, which describes
this issue. There is a comment from Jack Diederich from 2010-02-23 where
he says that using lambda or a function achieves the same, but I think
this example shows otherwise.
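
For readers on newer Python versions: functools.partialmethod, added in Python 3.4 (after this thread), is a descriptor intended for this situation. A minimal sketch, assuming a simplified Foo whose util() just returns its argument:

```python
import functools

class Foo(object):
    def util(self, param):
        return param

for i in range(5):
    # partialmethod is a descriptor, so looking the attribute up on an
    # instance yields a callable that already has self and param bound.
    setattr(Foo, 'test_{}'.format(i),
            functools.partialmethod(Foo.util, param=i))

f = Foo()
results = [getattr(f, 'test_{}'.format(n))() for n in range(5)]
print(results)
```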

I'm also thinking about throwing another aspect in there: Unless you're
using exec(), there is no way to put any variables as constants into the
function, i.e. to enforce early binding instead of the default late
binding. Using default parameters or functools.partial are both just
workarounds with limited applicability. Also, binding the parameters now
instead of later would reduce size and offer a speedup, so it could be a
worthwhile optimization.


> The main place where I see this type of problem is in a gui, where
> you're defining a callback to be used by a series of widgets, but you
> have a value that IS different for each item in the series. You write a
> loop much like you did, and discover that the last loop value is the
> only one used. The two cures above work, and you can also use lambda
> creatively.


Careful, lambda does not work, at least not easily! The problem is that
lambda only creates a local, anonymous function, but any names used
inside this function will only be evaluated when the function is called,
so I'm back at step 1, just with even less obvious code.
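
The "creative" use that does work is probably lambda combined with the default-parameter trick. A small sketch contrasting the two, with illustrative names:

```python
# Late binding: every lambda looks up n when it is called,
# after the loop has already finished.
late = [lambda: n for n in range(3)]

# Early binding: the default value n=n is evaluated when each lambda is created.
early = [lambda n=n: n for n in range(3)]

late_results = [f() for f in late]
early_results = [f() for f in early]
print(late_results, early_results)
```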


Greetings!

Uli



Ulrich Eckhardt 02-07-2013 08:59 AM

Re: puzzled by name binding in local function
 
Eureka!

On 06.02.2013 15:37, Dave Angel wrote:
> def myfunc2(i):
>     def myfunc2b():
>         print ("myfunc2 is using", i)
>     return myfunc2b


Earlier you wrote:
> There is only one instance of i, so it's not clear what you expect.
> Since it's not an argument to test(), it has to be found in the
> closure to the function. In this case, that's the global namespace.
> So each time the function is called, it fetches that global.


Actually, the important part missing in my understanding was the full
meaning of "closure" and how it works in Python. After failing to
understand how the pure Python version of functools.partial worked, I
started a little search and found e.g. "closures-in-python"[1], which
was a key element to understanding the whole picture.

Summary: The reason the above or the pure Python version work is that
they use the closure created by a function call to bind the values in.
My version used a loop instead, but the loop body does not create a
closure, so the effective closure is the surrounding global namespace.

:)

Uli


[1] http://ynniv.com/blog/2007/08/closures-in-python.html


