My Experiences Subclassing String

 
 
Fuzzyman
      06-07-2004
I recently went through a bit of a headache trying to subclass
string.... This is because the string is immutable and uses the
mysterious __new__ method rather than __init__ to 'create' a string.
To those who are new to subclassing the built-in types, my experiences
might prove helpful. Hopefully not too many inaccuracies.

I've just spent ages trying to subclass string.... and I'm very proud
to say I finally managed it !

The trouble is that the string type (str) is immutable - which means
that new instances are created using the mysterious __new__ method
rather than __init__ !! You still following me.... ?

SO :

class newstring(str):
    def __init__(self, value, othervalue):
        str.__init__(self, value)
        self.othervalue = othervalue

astring = newstring('hello', 'othervalue')

fails miserably. This is because the __new__ method of str is
called *before* the __init__ method.... and it complains that it has
been given too many arguments. What the __new__ method does is actually
return the new instance - for a string the __init__ method is just a dummy.
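
To see the failure for yourself, here is a minimal sketch. The names NaiveString and try_create are made up for illustration, and the exact wording of the error message depends on the Python version, so only the exception type is checked:

```python
class NaiveString(str):
    # Only __init__ is overridden, so the inherited str.__new__
    # still receives both constructor arguments.
    def __init__(self, value, othervalue):
        self.othervalue = othervalue


def try_create():
    """Return the exception raised by the two-argument call, or None."""
    try:
        NaiveString('hello', 'othervalue')
    except TypeError as exc:
        return exc
    return None


error = try_create()
print(type(error).__name__)  # TypeError
```

The exception is raised before __init__ ever runs, which is why changing __init__ alone cannot fix it.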

The bit I couldn't get (and I didn't have access to a python manual at
the time) - if the __new__ method is responsible for returning the new
instance of the string, surely it wouldn't have a reference to self;
since the 'self' wouldn't be created until after __new__ has been
called......

Actually that's wrong - so, a simple string type might look something
like this :

class newstring(str):
    def __new__(self, value):
        return str.__new__(self, value)
    def __init__(self, value):
        pass

See how the __new__ method returns the instance and the __init__ is
just a dummy.
If we want to add the extra attribute we can do this :


class newstring(str):
    def __new__(self, value, othervalue):
        return str.__new__(self, value)
    def __init__(self, value, othervalue):
        self.othervalue = othervalue

The order of creation is that the __new__ method is called which
returns the object *then* __init__ is called. Although the __new__
method receives the 'othervalue' it is ignored - and __init__ uses it.
In practice __new__ could probably do all of this - but I prefer to
mess around with __new__ as little as possible ! I was just glad I got
it working..... What it means is that I can create my own class of
objects - that in most situations will behave like strings, but have
their own attributes. The only restriction is that the string value is
immutable and must be set when the object is created. See the
excellent path module by Jason Orendorff for another example object
that behaves like a string but also has other attributes - although it
doesn't use the __new__ method; or the __init__ method I think.
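
For the curious, the "__new__ could probably do all of this" variant might look roughly like the sketch below. The class name TaggedString is hypothetical, and this is just one way of doing it, not a recommendation over the __init__ version:

```python
class TaggedString(str):
    def __new__(cls, value, othervalue):
        # Build the immutable string part first...
        obj = str.__new__(cls, value)
        # ...then attach the extra attribute before handing the object back.
        obj.othervalue = othervalue
        return obj


s = TaggedString('hello', 'extra')
print(s.upper(), s.othervalue)  # HELLO extra
```

No __init__ is defined here at all; the inherited one accepts the extra argument silently because __new__ has been overridden.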

Regards,

Fuzzy

Posted to Voidspace - Techie Blog :
http://www.voidspace.org.uk/voidspace/index.shtml
Experiences used in the python modules at :
http://www.voidspace.org.uk/atlantib...thonutils.html
 
Paul McGuire
      06-07-2004
"Fuzzyman" wrote in message:
> I recently went through a bit of a headache trying to subclass
> string.... This is because the string is immutable and uses the
> mysterious __new__ method rather than __init__ to 'create' a string.
> To those who are new to subclassing the built-in types, my experiences
> might prove helpful. Hopefully not too many inaccuracies.


<snip>

> The bit I couldn't get (and I didn't have access to a python manual at
> the time) - if the __new__ method is responsible for returning the new
> instance of the string, surely it wouldn't have a reference to self;
> since the 'self' wouldn't be created until after __new__ has been
> called......
>
> Actually that's wrong - so, a simple string type might look something
> like this :
>
> class newstring(str):
>     def __new__(self, value):
>         return str.__new__(self, value)
>     def __init__(self, value):
>         pass
>
> See how the __new__ method returns the instance and the __init__ is
> just a dummy.
> If we want to add the extra attribute we can do this :
>
>
> class newstring(str):
>     def __new__(self, value, othervalue):
>         return str.__new__(self, value)
>     def __init__(self, value, othervalue):
>         self.othervalue = othervalue
>
> The order of creation is that the __new__ method is called which
> returns the object *then* __init__ is called. Although the __new__
> method receives the 'othervalue' it is ignored - and __init__ uses it.

<snip>

Fuzzy -

I recently went down this rabbit hole while trying to optimize Literal
handling in pyparsing. You are close in your description, but there is one
basic concept that I think still needs to be sorted out for you.

Think of __new__ as a class-level factory method, not an instance method.
That first argument that you passed to your example as 'self' is not the
self instance, it is the class being new'ed. By luck, even though you
called it 'self', you passed it to str.__new__ where the class argument is
supposed to go, so everything still worked.
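
A small sketch of that point, assuming a hypothetical subclass named Traced and a module-level list named calls (neither comes from the original post). It records what each method receives as its first argument and in which order the two run:

```python
calls = []


class Traced(str):
    def __new__(cls, value):
        # The first argument here is the class being instantiated.
        calls.append(('__new__', cls))
        return str.__new__(cls, value)

    def __init__(self, value):
        # By now the instance exists; self is that instance.
        calls.append(('__init__', type(self)))


t = Traced('abc')
for name, first_arg in calls:
    print(name, first_arg.__name__)
```

__new__ shows up first in the list and is handed Traced itself, while __init__ runs second on the finished instance.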

The canonical/do-nothing __new__ method looks like this:

class A(object):
    def __new__(cls, *args):
        return object.__new__(cls)

There's nothing stopping you from looking at the args tuple to see if you
want to do more than this, but in truth that's what __init__ is for.

Here's a sample of using __new__ to return a different class of object,
depending on the initialization arguments:

class SpecialA(object):
    pass

class A(object):
    def __new__(cls, *args):
        print(cls, ":", args)
        if len(args) > 0 and args[0] == 2:
            return object.__new__(SpecialA)
        return object.__new__(cls)

obj = A()
print(type(obj))
obj = A(1)
print(type(obj))
obj = A(1, "test")
print(type(obj))
obj = A(2, "test")
print(type(obj))

gives the following output:

<class '__main__.A'> : ()
<class '__main__.A'>
<class '__main__.A'> : (1,)
<class '__main__.A'>
<class '__main__.A'> : (1, 'test')
<class '__main__.A'>
<class '__main__.A'> : (2, 'test')
<class '__main__.SpecialA'>
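
One consequence worth knowing about: if __new__ returns an object that is not an instance of the class being called, Python does not run that class's __init__ on it. A minimal sketch, using the hypothetical names Factory, Special and initialised:

```python
class Special(object):
    pass


class Factory(object):
    initialised = []  # records every argument that reached __init__

    def __new__(cls, kind):
        if kind == 'special':
            # Not an instance of Factory, so Factory.__init__ is skipped.
            return object.__new__(Special)
        return object.__new__(cls)

    def __init__(self, kind):
        Factory.initialised.append(kind)


a = Factory('plain')
b = Factory('special')
print(type(a).__name__, type(b).__name__)  # Factory Special
print(Factory.initialised)                 # ['plain']
```

So any setup the substitute object needs has to happen inside __new__ or in the substitute class itself.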


HTH,
-- Paul


 
Fuzzyman
      06-08-2004
"Paul McGuire" wrote in message:
[reluctant snip...]

>
> class SpecialA(object):
>     pass
>
> class A(object):
>     def __new__(cls, *args):
>         print(cls, ":", args)
>         if len(args) > 0 and args[0] == 2:
>             return object.__new__(SpecialA)
>         return object.__new__(cls)
>
> obj = A()
> print(type(obj))
> obj = A(1)
> print(type(obj))
> obj = A(1, "test")
> print(type(obj))
> obj = A(2, "test")
> print(type(obj))
>
> gives the following output:
>
> <class '__main__.A'> : ()
> <class '__main__.A'>
> <class '__main__.A'> : (1,)
> <class '__main__.A'>
> <class '__main__.A'> : (1, 'test')
> <class '__main__.A'>
> <class '__main__.A'> : (2, 'test')
> <class '__main__.SpecialA'>
>
>
> HTH,
> -- Paul


Thanks Paul, that was helpful and interesting.
I've posted the following correction to my blog :

Ok... so this is a correction to my post a couple of days ago about
subclassing the built in types (in python).

I *nearly* got it right. Because __new__ is the 'factory method' for
creating new instances it is actually a static method and *doesn't*
receive a reference to self as the first argument... it receives a
reference to the class as the first argument. By convention in python
this is a variable named cls rather than self (which refers to the
instance itself). What it means is that the example I gave *works*
fine, but the terminology is slightly wrong...

See the docs on the new style classes unifying types and classes. Also
thanks to Paul McGuire on comp.lang.python for helping me with this.

My example ought to read :
class newstring(str):
    def __new__(cls, value, *args, **keywargs):
        return str.__new__(cls, value)
    def __init__(self, value, othervalue):
        self.othervalue = othervalue

See how the __new__ method collects all the other arguments (using the
*args and **keywargs collectors) but ignores them - they are rightly
dealt with by __init__. You *could* examine these other arguments in
__new__ and even return an object that is an instance of a different
class depending on the parameters - see the example Paul gives...

Get all that then ?
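
A quick usage sketch of the corrected class, repeated here so the snippet runs on its own. The variable names s and upper are only for illustration. Note that string methods such as upper() hand back a plain str, so the extra attribute does not travel with the result:

```python
class newstring(str):
    def __new__(cls, value, *args, **keywargs):
        # Extra arguments are accepted but left for __init__ to handle.
        return str.__new__(cls, value)

    def __init__(self, value, othervalue):
        self.othervalue = othervalue


# Thanks to **keywargs in __new__, the keyword form works as well.
s = newstring('hello', othervalue='extra')
print(s == 'hello', s.othervalue)     # True extra

upper = s.upper()
print(type(upper) is str)             # True
print(hasattr(upper, 'othervalue'))   # False
```

If you need methods that return your own type, you would have to override them and wrap the result yourself.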
 