overriding __getitem__ for a subclass of dict

 
 
Steve Howell
 
      11-15-2009
I ran the following program, and found its output surprising in one
place:

    class OnlyAl:
        def __getitem__(self, key): return 'al'

    class OnlyBob(dict):
        def __getitem__(self, key): return 'bob'

    import sys; print sys.version

    al = OnlyAl()
    bob = OnlyBob()

    print al['whatever']
    al.__getitem__ = lambda key: 'NEW AND IMPROVED AL!'
    print al['whatever']

    print bob['whatever']
    bob.__getitem__ = lambda key: 'a NEW AND IMPROVED BOB seems impossible'
    print bob['whatever']

    2.6.2 (release26-maint, Apr 19 2009, 01:56:41)
    [GCC 4.3.3]
    al
    NEW AND IMPROVED AL!
    bob
    bob

In attempting to change the behavior of bob's dictionary lookup, I am
clearly doing something wrong, or maybe even attempting the impossible.

Obviously the examples are contrived, but I am interested, on a purely
academic level, in why setting __getitem__ on bob does not seem to change
the behavior of bob['foo']. Note that OnlyBob subclasses dict;
OnlyAl does not.

On a more practical level, I will explain what I am trying to do.
Basically, I am trying to create some code that allows me to spy on
arbitrary objects in a test environment. I want to write a spy()
method that takes an arbitrary object and overrides its implementation
of __getitem__ and friends so that I can see how library code is
invoking the object (with print statements or whatever). Furthermore,
I want spy() to recursively spy on objects that get produced from my
original object. The particular use case is that I am creating a
context for Django templates, and I want to see which objects are
getting rendered, all the way down the tree. It would be pretty easy
to just create a subclass of the context class to spy at the top
level, but I want to recursively spy on all its children, and that is
why I need a monkeypatching approach. The original version had spy
recursively returning proxy/masquerade objects that intercepted
__getitem__ calls, but it becomes brittle when the proxy objects go
off into places like template filters, where I am not prepared to
intercept all calls to the object, and where in some cases it is
impossible to gain control.

Although I am interested in comments on the general problems (spying
on objects, or spying on Django template rendering), I am most
interested in the specific mechanism for changing the __getitem__
method for a subclass on a dictionary. Thanks in advance!

 
Gary Herron
 
      11-15-2009
Steve Howell wrote:
> I ran the following program, and found its output surprising in one
> place:
>
>     class OnlyAl:
>         def __getitem__(self, key): return 'al'
>
>     class OnlyBob(dict):
>         def __getitem__(self, key): return 'bob'
>
>     import sys; print sys.version
>
>     al = OnlyAl()
>     bob = OnlyBob()
>
>     print al['whatever']
>     al.__getitem__ = lambda key: 'NEW AND IMPROVED AL!'
>     print al['whatever']
>
>     print bob['whatever']
>     bob.__getitem__ = lambda key: 'a NEW AND IMPROVED BOB seems impossible'
>     print bob['whatever']
>
>     2.6.2 (release26-maint, Apr 19 2009, 01:56:41)
>     [GCC 4.3.3]
>     al
>     NEW AND IMPROVED AL!
>     bob
>     bob
>


It's the difference between old-style and new-style classes. Type dict
and therefore OnlyBob are new style. OnlyAl defaults to old-style. If
you derive OnlyAl from type object, you'll get consistent results.

Gary Herron
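
[A minimal sketch of the point above, written for Python 3, where every class is new-style, so the old-style behaviour of OnlyAl cannot be reproduced there. The class names come from the original post; the `results` variable is only illustrative.]

```python
class OnlyAl(object):          # deriving from object makes this new-style
    def __getitem__(self, key):
        return 'al'

class OnlyBob(dict):
    def __getitem__(self, key):
        return 'bob'

al = OnlyAl()
bob = OnlyBob()

# Setting __getitem__ on the instances is allowed ...
al.__getitem__ = lambda key: 'NEW AND IMPROVED AL!'
bob.__getitem__ = lambda key: 'NEW AND IMPROVED BOB!'

# ... but subscription looks the method up on the type, so both
# instances keep the behaviour defined in their class.
results = (al['whatever'], bob['whatever'])
print(results)   # ('al', 'bob')
```

Both classes now behave the same way, which is the consistency Gary describes, although it is the OnlyBob behaviour that wins.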





 
Steve Howell
 
      11-15-2009
On Nov 15, 10:25 am, Steve Howell <showel...@yahoo.com> wrote:
> [see original post...]
> I am most
> interested in the specific mechanism for changing the __getitem__
> method for a subclass on a dictionary. Thanks in advance!


Sorry for replying to myself, but I just realized that the last
statement in my original post was a little imprecise.

I am more precisely looking for a way to change the behavior of
foo['bar'] (side effects and possibly return value) where "foo" is an
instance of a class that subclasses "dict," and where "foo" is not
created by me. The original post gives more context and example code
that does not work as I expect/desire.

 
Steve Howell
 
      11-15-2009
On Nov 15, 11:19 am, Gary Herron <gher...@islandtraining.com> wrote:
> Steve Howell wrote:
> > I ran the following program, and found its output surprising in one
> > place:
> >
> >     class OnlyAl:
> >         def __getitem__(self, key): return 'al'
> >
> >     class OnlyBob(dict):
> >         def __getitem__(self, key): return 'bob'
> >
> >     import sys; print sys.version
> >
> >     al = OnlyAl()
> >     bob = OnlyBob()
> >
> >     print al['whatever']
> >     al.__getitem__ = lambda key: 'NEW AND IMPROVED AL!'
> >     print al['whatever']
> >
> >     print bob['whatever']
> >     bob.__getitem__ = lambda key: 'a NEW AND IMPROVED BOB seems impossible'
> >     print bob['whatever']
> >
> >     2.6.2 (release26-maint, Apr 19 2009, 01:56:41)
> >     [GCC 4.3.3]
> >     al
> >     NEW AND IMPROVED AL!
> >     bob
> >     bob
> >
> It's the difference between old-style and new-style classes. Type dict
> and therefore OnlyBob are new style. OnlyAl defaults to old-style. If
> you derive OnlyAl from type object, you'll get consistent results.
>


Thanks, Gary. My problem is that I am actually looking for the
behavior that the old-style OnlyAl provides, not OnlyBob--allowing me
to override the behavior of al['foo'] and bob['foo']. I (hopefully)
clarified my intent in a follow-up post that was sent before I saw
your reply. Here it is re-posted for convenience of discussion:

"I am more precisely looking for a way to change the behavior of
foo['bar'] (side effects and possibly return value) where "foo" is an
instance of a class that subclasses "dict," and where "foo" is not
created by me."


 
Jon Clements
 
      11-15-2009
On Nov 15, 7:23 pm, Steve Howell <showel...@yahoo.com> wrote:
> On Nov 15, 10:25 am, Steve Howell <showel...@yahoo.com> wrote:
>
> > [see original post...]
> > I am most
> > interested in the specific mechanism for changing the __getitem__
> > method for a subclass on a dictionary. Thanks in advance!
>
> Sorry for replying to myself, but I just realized that the last
> statement in my original post was a little imprecise.
>
> I am more precisely looking for a way to change the behavior of
> foo['bar'] (side effects and possibly return value) where "foo" is an
> instance of a class that subclasses "dict," and where "foo" is not
> created by me. The original post gives more context and example code
> that does not work as I expect/desire.


[quote from http://docs.python.org/reference/datamodel.html]
For instance, if a class defines a method named __getitem__(), and x
is an instance of this class, then x[i] is roughly equivalent to
x.__getitem__(i) for old-style classes and type(x).__getitem__(x, i)
for new-style classes.
[/quote]

A quick hack could be:

class Al(dict):
    def __getitem__(self, key):
        return self.spy(key)
    def spy(self, key):
        return 'Al'

>>> a = Al()
>>> a[3]
'Al'
>>> a.spy = lambda key: 'test'
>>> a[3]
'test'
>>> b = Al()
>>> b[3]
'Al'

Seems to be what you're after anyway...

hth,
Jon.
 
Steve Howell
 
      11-15-2009
On Nov 15, 12:01 pm, Jon Clements <jon...@googlemail.com> wrote:
> On Nov 15, 7:23 pm, Steve Howell <showel...@yahoo.com> wrote:
>
> > I am more precisely looking for a way to change the behavior of
> > foo['bar'] (side effects and possibly return value) where "foo" is an
> > instance of a class that subclasses "dict," and where "foo" is not
> > created by me. The original post gives more context and example code
> > that does not work as I expect/desire.
>
> [quote from http://docs.python.org/reference/datamodel.html]
> For instance, if a class defines a method named __getitem__(), and x
> is an instance of this class, then x[i] is roughly equivalent to
> x.__getitem__(i) for old-style classes and type(x).__getitem__(x, i)
> for new-style classes.
> [/quote]
>
> A quick hack could be:
>
> class Al(dict):
>     def __getitem__(self, key):
>         return self.spy(key)
>     def spy(self, key):
>         return 'Al'
>
> >>> a = Al()
> >>> a[3]
> 'Al'
> >>> a.spy = lambda key: 'test'
> >>> a[3]
> 'test'
> >>> b = Al()
> >>> b[3]
> 'Al'
>
> Seems to be what you're after anyway...
>


This is very close to what I want, but the problem is that external
code is defining Al, and I do not seem to be able to get this
statement to have any effect:

a.__getitem__ = lambda key: 'test'

How can I change the behavior of a['foo'] without redefining Al?

 
Steve Howell
 
      11-15-2009
On Nov 15, 12:01 pm, Jon Clements <jon...@googlemail.com> wrote:
> On Nov 15, 7:23 pm, Steve Howell <showel...@yahoo.com> wrote:
>
> > On Nov 15, 10:25 am, Steve Howell <showel...@yahoo.com> wrote:
> >
> > > [see original post...]
> > > I am most
> > > interested in the specific mechanism for changing the __getitem__
> > > method for a subclass on a dictionary. Thanks in advance!
> >
> > Sorry for replying to myself, but I just realized that the last
> > statement in my original post was a little imprecise.
> >
> > I am more precisely looking for a way to change the behavior of
> > foo['bar'] (side effects and possibly return value) where "foo" is an
> > instance of a class that subclasses "dict," and where "foo" is not
> > created by me. The original post gives more context and example code
> > that does not work as I expect/desire.
>
> [quote from http://docs.python.org/reference/datamodel.html]
> For instance, if a class defines a method named __getitem__(), and x
> is an instance of this class, then x[i] is roughly equivalent to
> x.__getitem__(i) for old-style classes and type(x).__getitem__(x, i)
> for new-style classes.
> [/quote]
>


Ok, thanks to Jon and Gary pointing me in the right direction, I think
I can provide an elaborate answer to my own question now.

Given an already instantiated instance foo of Foo, where Foo subclasses
dict, you cannot change the general behavior of calls of the form
foo[bar]. (Obviously you can change the behavior for specific examples of
bar after instantiation by setting foo['apple'] and foo['banana'] as
needed, but that's not what I mean.)

This may be surprising to naive programmers like myself, given that it is
possible to change the behavior of foo.bar() after instantiation by
simply saying "foo.bar = some_method". Also, with old-style classes,
you can change the behavior of foo[bar] by setting foo.__getitem__.
Even in new-style classes, you can change the behavior of
foo.__getitem__(bar) by saying foo.__getitem__ = some_method, but it
is a pointless exercise, since foo.__getitem__ will have no bearing on
the processing of "foo[bar]." Finally, you can define __getitem__ on
the Foo class itself to change how foo[bar] gets resolved, presumably
even after instantiation of foo itself (but this does not allow for
instance-specific behavior).

Here is the difference:

    foo.value looks for a definition of value on the instance before
      looking in the class hierarchy
    foo[bar] can find __getitem__ on foo before looking at Foo and its
      superclasses, if Foo is old-style
    foo[bar] will only look for __getitem__ in the class hierarchy if
      Foo derives from a new-style class
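
[A short sketch of the three cases, assuming Python 3, where only the new-style rules exist. Foo, bar and the variable names are placeholders for illustration, not code from any library.]

```python
class Foo(dict):
    def bar(self):
        return 'class bar'

foo = Foo(apple=1)

# Ordinary attribute: the instance dictionary is consulted first.
foo.bar = lambda: 'instance bar'
ordinary = foo.bar()                      # 'instance bar'

# Special method: an instance attribute only affects explicit calls.
foo.__getitem__ = lambda key: 'instance getitem'
explicit = foo.__getitem__('apple')       # 'instance getitem'
implicit = foo['apple']                   # 1, resolved via type(foo)

# Defining __getitem__ on the class affects existing instances too.
Foo.__getitem__ = lambda self, key: 'class getitem'
after_class_patch = foo['apple']          # 'class getitem'

print(ordinary, explicit, implicit, after_class_patch)
```

The last assignment confirms the "presumably" above: patching the class after instantiation does change foo[bar], but for every Foo instance at once.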

Does anybody have any links that point to the rationale for ignoring
instance definitions of __getitem__ when new-style classes are
involved? I assume it has something to do with performance or
protecting us from our own mistakes?

So now I am still in search of a way to hook into calls to foo[bar]
after foo has been instantiated. It is all test code, so I am not
particularly concerned about safety or future compatibility. I can do
something really gross like monkeypatch Foo class instead of foo
instance and keep track of the ids to decide when to override
behavior, but there must be a simpler way to do this.
 
MRAB
 
      11-16-2009
Christian Heimes wrote:
> Steve Howell wrote:
>> Does anybody have any links that point to the rationale for ignoring
>> instance definitions of __getitem__ when new-style classes are
>> involved? I assume it has something to do with performance or
>> protecting us from our own mistakes?
>
> Most magic methods are implemented as descriptors. Descriptors are only
> looked up on the type to increase the performance of the interpreter and
> to simplify the C API. The same is true for other descriptors like
> properties. The interpreter invokes egg.__getitem__(arg) as
> type(egg).__getitem__(egg, arg).
>
>> So now I am still in search of a way to hook into calls to foo[bar]
>> after foo has been instantiated. It is all test code, so I am not
>> particularly concerned about safety or future compatibility. I can do
>> something really gross like monkeypatch Foo class instead of foo
>> instance and keep track of the ids to decide when to override
>> behavior, but there must be a simpler way to do this.
>
> Try this untested code:
>
> class Spam(dict):
>     def __getitem__(self, key):
>         getitem = self.__dict__.get("__getitem__", dict.__getitem__)
>         return getitem(self, key)
>
> Because dict is the most important and speed critical type in Python it
> has some special behaviors. If you are going to overwrite __getitem__ of
> a dict subclass then you have to overwrite all methods that call
> __getitem__, too. These are get, pop, update and setdefault.
>

I wonder whether it's possible to define 2 behaviours, an optimised one
for instances of a class and another non-optimised one for instances of
its subclasses. That would make it easier to subclass built-in classes
without losing their speed.
 
Steve Howell
 
      11-16-2009
On Nov 15, 4:03 pm, Christian Heimes <li...@cheimes.de> wrote:
> Steve Howell wrote:
> > Does anybody have any links that point to the rationale for ignoring
> > instance definitions of __getitem__ when new-style classes are
> > involved? I assume it has something to do with performance or
> > protecting us from our own mistakes?
>
> Most magic methods are implemented as descriptors. Descriptors are only
> looked up on the type to increase the performance of the interpreter and
> to simplify the C API. The same is true for other descriptors like
> properties. The interpreter invokes egg.__getitem__(arg) as
> type(egg).__getitem__(egg, arg).
>


Is the justification along performance lines documented anywhere?

> > So now I am still in search of a way to hook into calls to foo[bar]
> > after foo has been instantiated. It is all test code, so I am not
> > particularly concerned about safety or future compatibility. I can do
> > something really gross like monkeypatch Foo class instead of foo
> > instance and keep track of the ids to decide when to override
> > behavior, but there must be a simpler way to do this.
>
> Try this untested code:
>
> class Spam(dict):
>     def __getitem__(self, key):
>         getitem = self.__dict__.get("__getitem__", dict.__getitem__)
>         return getitem(self, key)
> [...]


Not sure how this helps me, unless I am misunderstanding...

It is the futility of writing lowercase_spam.__getitem__ that is
setting me back. For my use case I do not want to override
__getitem__ for all Spam objects, nor do I even have the option to
modify the Spam class in some cases.
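
[One technique not proposed in this thread, but which fits the constraint of leaving Spam untouched, is to rebind a single instance to a throwaway subclass by assigning to its __class__. The sketch below assumes Python 3 and a Spam class written in Python (it does not work on plain built-in dict instances); spy_on, seen and the 'Spied' prefix are hypothetical names.]

```python
class Spam(dict):     # stand-in for a class that cannot be modified
    pass

def spy_on(obj, log):
    """Rebind obj to a one-off subclass whose __getitem__ records lookups."""
    base = type(obj)

    def getitem(self, key):
        value = base.__getitem__(self, key)
        log.append((key, value))
        return value

    # The new type is used by this instance only; base is left untouched.
    obj.__class__ = type('Spied' + base.__name__, (base,),
                         {'__getitem__': getitem})
    return obj

seen = []
lowercase_spam = spy_on(Spam(eggs=2), seen)
other = Spam(eggs=3)

lowercase_spam['eggs']    # recorded
other['eggs']             # untouched
print(seen, isinstance(lowercase_spam, Spam))   # [('eggs', 2)] True
```

The instance still passes isinstance checks against Spam, but code that compares type(obj) to Spam exactly would notice the difference.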


 
Steve Howell
 
      11-16-2009
On Nov 15, 4:58 pm, Steve Howell <showel...@yahoo.com> wrote:
> On Nov 15, 4:03 pm, Christian Heimes <li...@cheimes.de> wrote:
>
> > Try this untested code:
> >
> > class Spam(dict):
> >     def __getitem__(self, key):
> >         getitem = self.__dict__.get("__getitem__", dict.__getitem__)
> >         return getitem(self, key)
> > [...]
>
> [I originally responded...] Not sure how this helps me, unless I am misunderstanding...
>


Ok, now I get where you were going with the idea. The following code
runs as expected. Even in pure testing mode, I would want to make it
a little more robust, but it illustrates the basic idea that you can
monitor just particular objects by overriding the class method to look
for an attribute on the instance before doing any special processing.

class MyDict(dict):
    pass

dict1 = MyDict()
dict1['foo'] = 'bar'

dict2 = MyDict()
dict2['spam'] = 'eggs'

dict3 = MyDict()
dict3['BDFL'] = 'GvR'

def spy(obj):
    # Replacement __getitem__, installed once per class.
    def mygetitem(self, key):
        value = self.__class__.__old_getitem__(self, key)
        # Only instances flagged by spy() report their lookups.
        if hasattr(self, '__SPYING__'):
            print 'derefing %s to %s on %s' % (key, value, self)
        return value
    if not hasattr(obj.__class__, '__HOOKED__'):
        setattr(obj.__class__, '__old_getitem__',
                obj.__class__.__getitem__)
        setattr(obj.__class__, '__getitem__', mygetitem)
        setattr(obj.__class__, '__HOOKED__', True)
    obj.__SPYING__ = True

dict1['foo'] # not spied yet
spy(dict1) # this changes class and instance
dict1['foo'] # spied
dict2['spam'] # not spied
spy(dict3) # this only changes instance
dict3['BDFL'] # spied
dict2['spam'] # still not spied

Thanks, Christian!

 