Re: fastest way to detect a user type

Steven D'Aprano
02-01-2009
Robin Becker wrote:

> Whilst considering a port of old code to python 3 I see that in several
> places we are using type comparisons to control processing of user
> instances (as opposed to instances of built in types eg float, int, str)
>
> I find that the obvious alternatives are not as fast as the current
> code; func0 below. On my machine isinstance seems slower than type for
> some reason. My 2.6 timings are


First question is, why do you care that it's slower? The difference between
the fastest and slowest functions is 1.16-0.33 = 0.83 microsecond. If you
call the slowest function one million times, your code will run less than a
second longer.

Does that really matter, or are you engaged in premature optimization? In
your test functions, the branches all execute "pass". Your real code
probably calls other functions, makes calculations, etc, which will all
take time. Probably milliseconds rather than microseconds. I suspect you're
concerned about a difference of 0.1 of a percent, of one small part of your
entire application. Unless you have profiled your code and this really is a
bottleneck, I recommend you worry more about making your code readable and
maintainable than worrying about micro-optimisations.
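
A minimal sketch of how the per-call cost could be measured before deciding whether it matters. The class X and the helper names here are illustrative assumptions, not the OP's code, and the numbers will differ from machine to machine:

```python
import timeit

class X:
    pass

def by_type(ob):
    # Exact-type comparison: ignores subclasses.
    return type(ob) is X

def by_isinstance(ob):
    # Subclass-aware check.
    return isinstance(ob, X)

def per_call_microseconds(func, arg, number=200000):
    # timeit.timeit returns the total seconds taken by `number` calls.
    total = timeit.timeit(lambda: func(arg), number=number)
    return total / number * 1e6

if __name__ == "__main__":
    ob = X()
    for func in (by_type, by_isinstance):
        cost = per_call_microseconds(func, ob)
        print(func.__name__, round(cost, 3), "microseconds per call")
```

Comparing that per-call figure with the time the surrounding code takes (from a profiler run, say) shows whether the type test is worth worrying about.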

Even more important than being readable is being *correct*, and I believe
that your code has some unexpected failure modes (bugs). See below:



> so func 3 seems to be the fastest option for the case when the first
> test matches, but is poor when it doesn't. Can anyone suggest a better
> way to determine if an object is a user instance?
>
> ##############################
> from types import InstanceType


I believe this will go away in Python 3, as all classes will be New Style
classes.


> class X:
>     __X__=True


This is an Old Style class in Python 2.x, and a New Style class in Python 3.

Using hasattr(ob, '__X__') is a curious way of detecting what you want. I
suppose it could be argued that it is a variety of duck-typing: "if it has
a duck's bill, it must be a duck". (Unless it is a platypus, of course.)
However, attribute names with leading and trailing double-underscores are
reserved by Python for its own special methods and attributes. You should
rename it to something more appropriate: _MAGIC_LABEL, say.
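
A small sketch of the platypus problem, assuming the marker is renamed to _MAGIC_LABEL. The Platypus class and both helper functions are made up for illustration:

```python
class X:
    _MAGIC_LABEL = True  # marker attribute, renamed as suggested above

class V(X):
    pass

class Platypus:
    # Unrelated class that happens to carry the same marker.
    _MAGIC_LABEL = True

def looks_like_x(ob):
    # Duck-typing style: only asks whether the marker exists.
    return hasattr(ob, "_MAGIC_LABEL")

def is_x(ob):
    # Asks the instance what it actually is.
    return isinstance(ob, X)

print(looks_like_x(V()), is_x(V()))
print(looks_like_x(Platypus()), is_x(Platypus()))
```

The marker test accepts anything that carries the attribute, while isinstance only accepts X and its subclasses.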


> class V(X):
>     pass
>
> def func0(ob):
>     t=type(ob)
>     if t is InstanceType:
>         pass


This test is too broad. It will succeed for an instance of *any* old-style
class, not just X and V instances. That's probably not what you want.

It will also fail if ob is an instance of a New Style class. Remember that
in Python 3, all classes become new-style.
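
For what it's worth, here is a quick way to see the difference under Python 3 (a sketch; the variable names are mine). In Python 2, type() of any old-style instance is InstanceType, whatever its class; in Python 3 the types module has no such name and type() returns the class itself:

```python
import types

class X:
    pass

ob = X()

# Python 3 removed InstanceType along with old-style classes.
has_instance_type = hasattr(types, "InstanceType")

# type() of an instance is now the class itself.
exact_type = type(ob)

print(has_instance_type)
print(exact_type is X)
```

So a test written as "t is InstanceType" cannot even be imported unchanged into Python 3 code.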


>     elif t in (float, int):
>         pass


This test will fail if ob is a subclass of float or int. That's almost
certainly the wrong behavior. A better way of writing that is:

    elif issubclass(t, (float, int)):
        pass
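
To make the failure concrete, a short sketch with a hypothetical float subclass (Celsius is invented here purely for the demonstration):

```python
class Celsius(float):
    """Hypothetical float subclass used only for illustration."""

t = type(Celsius(21.5))

# Exact membership test: Celsius is neither float nor int itself.
exact_match = t in (float, int)

# Subclass-aware test: Celsius derives from float.
subclass_match = issubclass(t, (float, int))

# bool is a subclass of int, so True and False pass this test as well.
bool_match = issubclass(bool, int)

print(exact_match, subclass_match, bool_match)
```

Note the last line: if booleans should not be treated as numbers, they need to be excluded explicitly.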


>     else:
>         pass
>
> def func1(ob):
>     if isinstance(ob,X):
>         pass


If you have to do type checking, that's the recommended way of doing so.



>     elif type(ob) in (float, int):
>         pass


The usual way to write that is:

    elif isinstance(ob, (float, int)):
        pass
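
Putting the two isinstance tests together, func1 might look something like the sketch below. The classify name and the returned strings are placeholders of mine, standing in for whatever the real branches do:

```python
class X:
    pass

class V(X):
    pass

def classify(ob):
    # Branch 1: instances of X or any subclass of X.
    if isinstance(ob, X):
        return "user"
    # Branch 2: floats, ints and their subclasses.
    elif isinstance(ob, (float, int)):
        return "number"
    # Branch 3: everything else.
    else:
        return "other"

for sample in (V(), 3, 2.5, "text"):
    print(type(sample).__name__, classify(sample))
```

This runs unchanged on Python 3 and does not depend on InstanceType.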



Hope this helps,


--
Steven

 
Paul Rubin
02-01-2009
Steven D'Aprano writes:
> First question is, why do you care that it's slower? The difference between
> the fastest and slowest functions is 1.16-0.33 = 0.83 microsecond.


That's a 71% speedup, pretty good if you ask me.

> If you call the slowest function one million times, your code will
> run less than a second longer.


What if you call it a billion times, or a trillion times, or a
quadrillion times, you see where this is going? If you're testing
100-digit numbers, there are an awful lot of them before you run out.
 
Steven D'Aprano
02-01-2009
Paul Rubin wrote:

> Steven D'Aprano writes:
>> First question is, why do you care that it's slower? The difference
>> between the fastest and slowest functions is 1.16-0.33 = 0.83
>> microsecond.
>
> That's a 71% speedup, pretty good if you ask me.


Don't you care that the code is demonstrably incorrect? The OP is
investigating options to use in Python 3, but the fastest method will fail,
because the "type is InstanceType" test will no longer work. (I believe the
fastest method, as given, is incorrect even in Python 2.x, as it will
accept ANY old-style class instead of just the relevant X or V classes.)

That reminds me of something that happened to my wife some years ago: she
was in a van with her band's roadies, and one asked the driver "Are you
sure you know where you're going?", to which the driver replied, "Who
cares? We're making great time." (True story.)

If you're going to accept incorrect code in order to save time, then I can
write even faster code:

def func4(ob):
    pass

Try beating that for speed!


>> If you call the slowest function one million times, your code will
>> run less than a second longer.
>
> What if you call it a billion times, or a trillion times, or a
> quadrillion times, you see where this is going?


It doesn't matter. The proportion of time saved will remain the same. If you
run it a trillion times, you'll save 12 minutes in a calculation that takes
278 hours to run. Big Effing Deal. Saving such trivial amounts of time is
not worth the cost of hard-to-read or incorrect code.

Of course, if you have profiled your code and discovered that *significant*
amounts of time are being used in type-testing, *then* such a
micro-optimization may be worth doing. But I already allowed for that:

"Does that really matter...?"
(the answer could be Yes)

"Unless you have profiled your code and this really is a bottleneck ..."
(it could be)


> If you're testing
> 100-digit numbers, there are an awful lot of them before you run out.


Yes. So what? Once you've tested them, then what? If *all* you are doing
with them is testing them, your application is pretty boring. Even a print
statement afterwards is going to take 1000 times longer than doing the
type-test. In any useful application, the amount of time used in
type-testing is almost surely going to be a small fraction of the total
runtime. A 71% speedup on 50% of the runtime is significant; but a 71%
speedup on 0.1% of the total execution time is not.
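
The arithmetic can be sketched as follows. The 0.1% share of runtime is an assumed figure rather than a measurement; with it, the 12 minutes out of 278 hours mentioned above falls out, which appears to be how that number was reached:

```python
def overall_saving(fraction_of_runtime, local_speedup):
    # Fraction of total runtime saved when only one part gets faster.
    return fraction_of_runtime * local_speedup

# Speedup of the type test itself: 0.83 saved out of 1.16 microseconds.
LOCAL_SPEEDUP = 0.83 / 1.16  # roughly 0.71

# Type-testing is half of the runtime: roughly 36% saved overall.
big = overall_saving(0.50, LOCAL_SPEEDUP)

# Type-testing is 0.1% of the runtime (assumed): roughly 0.07% saved.
small = overall_saving(0.001, LOCAL_SPEEDUP)

# Applied to a 278-hour run, in minutes: roughly 12.
minutes_saved = 278 * 60 * small

print(round(big, 3), round(small, 5), round(minutes_saved, 1))
```

The local speedup stays at 71% however many times the function is called; only the share of runtime it applies to decides whether anyone will notice.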



--
Steven

 
Steven D'Aprano
02-01-2009
Robin Becker wrote:

> Steven D'Aprano wrote:
>> Paul Rubin wrote:
>>
>>> Steven D'Aprano writes:
>>>> First question is, why do you care that it's slower? The difference
>>>> between the fastest and slowest functions is 1.16-0.33 = 0.83
>>>> microsecond.
>>> That's a 71% speedup, pretty good if you ask me.
>>
>> Don't you care that the code is demonstrably incorrect? The OP is
>> investigating options to use in Python 3, but the fastest method will
>> fail, because the "type is InstanceType" test will no longer work. (I
>> believe the fastest method, as given, is incorrect even in Python 2.x, as
>> it will accept ANY old-style class instead of just the relevant X or V
>> classes.)
>
> I'm not clear why this is true? Not all instances will have the __X__
> attribute or has something else changed in Python3?


The func0() test doesn't look for __X__.


> The original code was intended to be called with only a subset of all
> class instances being passed as argument; as currently written it was
> unsafe because an instance of an arbitrary old class would pass into
> branch 1. Of course it will still be unsafe as arbitrary instances end
> up in branch 3
>
> The intent is to firm up the set of cases being accepted in the first
> branch. The problem is that when all instances are new style then
> there's no easy check for the other acceptable arguments eg float,int,
> str etc,


Of course there is.

isinstance(ob, (float, int))

is the easy, and correct, way to check if ob is a float or int.


> as I see it, the instances must be of a known class or have a
> distinguishing attribute.


Are you sure you need to check for different types in the first place? Just
how polymorphic is your code, really? It's hard to judge because I don't
know what your code actually does.


> As for the timing, when I tried the effect of func1 on our unit tests I
> noticed that it slowed the whole test suite by 0.5%.


An entire half a percent slower. Wow.

That's like one minute versus one minute and 0.3 second. Or one hour, versus
one hour and 18 seconds. I find it very difficult to get worked up over
such small differences. I think you're guilty of premature optimization:
wasting time and energy trying to speed up parts of the code that are
trivial. (Of course I could be wrong, but I doubt it.)



> Luckily func 3
> style improved things by about 0.3% so that's what I'm going for.


I would call that the worst solution. Not only are you storing an attribute
which is completely redundant (instances already know what type they are,
you don't need to manually store a badge on them to mark them as an
instance of a class), but you're looking up this attribute only to
immediately throw away the value you get. The only excuse for this extra
redirection would be if it were significantly faster. But it isn't: you
said it yourself, 0.3% speed up. That's like 60 seconds versus 59.82
seconds.


--
Steven

 