Performance on local constants?

 
 
William McBrine
      12-22-2007
Hi all,

I'm pretty new to Python (a little over a month). I was wondering -- is
something like this:

s = re.compile('whatever')

def t(whatnot):
    return s.search(whatnot)

for i in xrange(1000):
    print t(something[i])

significantly faster than something like this:

def t(whatnot):
    s = re.compile('whatever')
    return s.search(whatnot)

for i in xrange(1000):
    result = t(something[i])

? Or is Python clever enough to see that the value of s will be the same
on every call, and thus only compile it once?

--
09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0 -- pass it on
 
Paddy
      12-22-2007
On Dec 22, 10:53 am, William McBrine <(E-Mail Removed)> wrote:
> Hi all,
>
> I'm pretty new to Python (a little over a month). I was wondering -- is
> something like this:
>
> s = re.compile('whatever')
>
> def t(whatnot):
>     return s.search(whatnot)
>
> for i in xrange(1000):
>     print t(something[i])
>
> significantly faster than something like this:
>
> def t(whatnot):
>     s = re.compile('whatever')
>     return s.search(whatnot)
>
> for i in xrange(1000):
>     result = t(something[i])
>
> ? Or is Python clever enough to see that the value of s will be the same
> on every call, and thus only compile it once?
>
> --
> 09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0 -- pass it on


Python's re module does keep a cache of compiled patterns, but calling
re.compile on every pass through the function is still going to take time.

Best to do as the docs say and compile your REs once, before use, if
you can.

The timeit module: http://www.diveintopython.org/perfor...ng/timeit.html
will allow you to do your own timings.
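
A minimal sketch of such a timing, written for Python 3 (the thread itself uses Python 2 syntax). The sample text and the names t_global, t_local and time_it are made up for illustration; only the two function bodies come from the question.

```python
import re
import timeit

TEXT = "some long string containing a whatever"  # illustrative input

s = re.compile("whatever")  # compiled once, when the module is loaded


def t_global(whatnot):
    # Reuses the module-level compiled pattern.
    return s.search(whatnot)


def t_local(whatnot):
    # Calls re.compile on every invocation.
    local_s = re.compile("whatever")
    return local_s.search(whatnot)


def time_it(func, number=100_000):
    # Best of 3 runs; total seconds for `number` calls.
    return min(timeit.repeat(lambda: func(TEXT), number=number, repeat=3))


if __name__ == "__main__":
    for func in (t_global, t_local):
        print(func.__name__, time_it(func))
```

The figures depend on the machine and the Python version, so no output is shown here.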

- Paddy.
 
John Machin
      12-22-2007
On Dec 22, 9:53 pm, William McBrine <(E-Mail Removed)> wrote:
> Hi all,
>
> I'm pretty new to Python (a little over a month). I was wondering -- is
> something like this:
>
> s = re.compile('whatever')
>
> def t(whatnot):
>     return s.search(whatnot)
>
> for i in xrange(1000):
>     print t(something[i])
>
> significantly faster than something like this:
>
> def t(whatnot):
>     s = re.compile('whatever')
>     return s.search(whatnot)
>
> for i in xrange(1000):
>     result = t(something[i])
>
> ?


No.

> Or is Python clever enough to see that the value of s will be the same
> on every call,


No. It doesn't have a crystal ball.

> and thus only compile it once?


But it is smart enough to maintain a cache, which achieves the desired
result.
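
One way to watch that cache at work, sketched in Python 3 and assuming CPython (the cache is an implementation detail of the re module, not a documented guarantee): two re.compile calls with the same pattern and flags hand back the same object until the cache is emptied with re.purge().

```python
import re

first = re.compile("whatever")
second = re.compile("whatever")

# With the cache in place, both names refer to one pattern object.
print(first is second)

# re.purge() empties the cache, so the next call builds a fresh object.
re.purge()
third = re.compile("whatever")
print(third is first)
```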

Why don't you do some timings?

While you're at it, try this:

def t2(whatnot):
    return re.search('whatever', whatnot)

and this:

t3 = re.compile('whatever').search

HTH,
John
 
Duncan Booth
      12-22-2007
William McBrine <(E-Mail Removed)> wrote:

> Hi all,
>
> I'm pretty new to Python (a little over a month). I was wondering -- is
> something like this:
>
> s = re.compile('whatever')
>
> def t(whatnot):
>     return s.search(whatnot)
>
> for i in xrange(1000):
>     print t(something[i])
>
> significantly faster than something like this:
>
> def t(whatnot):
>     s = re.compile('whatever')
>     return s.search(whatnot)
>
> for i in xrange(1000):
>     result = t(something[i])
>
> ? Or is Python clever enough to see that the value of s will be the same
> on every call, and thus only compile it once?
>


The best way to answer these questions is always to try it out for
yourself. Have a look at 'timeit.py' in the library: you can run
it as a script to time simple things or import it from longer scripts.

C:\Python25>python lib/timeit.py -s "import re;s=re.compile('whatnot')" "s.search('some long string containing a whatnot')"
1000000 loops, best of 3: 1.05 usec per loop

C:\Python25>python lib/timeit.py -s "import re" "re.compile('whatnot').search('some long string containing a whatnot')"
100000 loops, best of 3: 3.76 usec per loop

C:\Python25>python lib/timeit.py -s "import re" "re.search('whatnot', 'some long string containing a whatnot')"
100000 loops, best of 3: 3.98 usec per loop

So it looks like it takes a couple of microseconds overhead if you
don't pre-compile the regular expression. That could be significant
if you have simple matches as above, or irrelevant if the match is
complex and slow.

You can also try measuring the compile time separately:

C:\Python25>python lib/timeit.py -s "import re" "re.compile('whatnot')"
100000 loops, best of 3: 2.36 usec per loop

C:\Python25>python lib/timeit.py -s "import re" "re.compile('<(?:p|div)[^>]*>(?P<pat0>(?:(?P<atag0>\\<a[^>]*\\>)\\<img[^>]+class\\s*=[^=>]*captioned[^>]+\\>\\</a\\>)|\\<img[^>]+class\\s*=[^=>]*captioned[^>]+\\>)</(?:p|div)>|(?P<pat1>(?:(?P<atag1>\\<a[^>]*\\>)\\<img[^>]+class\\s*=[^=>]*captioned[^>]+\\>\\</a\\>)|\\<img[^>]+class\\s*=[^=>]*captioned[^>]+\\>)')"
100000 loops, best of 3: 2.34 usec per loop

It makes no difference whether you use a trivial regular expression
or a complex one: Python remembers (if I remember correctly) the last
100 expressions it compiled, so after the first loop iteration timeit
is measuring a cache lookup rather than a real compilation, and the
overhead will be pretty constant.
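
A rough way to separate the two costs, sketched in Python 3: empty the cache with re.purge() before each compile to time a genuine compilation, and compare that with the cached call. The pattern and the helper names are invented for this illustration.

```python
import re
import timeit

# Made-up pattern, only moderately complex.
PATTERN = r"<(?:p|div)[^>]*>\s*<img[^>]+class\s*=[^=>]*captioned[^>]+>"


def compile_cached():
    # After the first call this is served from re's internal cache.
    return re.compile(PATTERN)


def compile_uncached():
    # Emptying the cache first forces a real compilation every time.
    re.purge()
    return re.compile(PATTERN)


def best_time(func, number=500):
    # Best of 3 runs; total seconds for `number` calls.
    return min(timeit.repeat(func, number=number, repeat=3))


if __name__ == "__main__":
    print("cached  :", best_time(compile_cached))
    print("uncached:", best_time(compile_uncached))
```

With the cache defeated, the cost should grow with the complexity of the pattern, which the cached timings above cannot show.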
 
Steven D'Aprano
      12-22-2007
On Sat, 22 Dec 2007 10:53:39 +0000, William McBrine wrote:

> Hi all,
>
> I'm pretty new to Python (a little over a month). I was wondering -- is
> something like this:
>
> s = re.compile('whatever')
>
> def t(whatnot):
>     return s.search(whatnot)
>
> for i in xrange(1000):
>     print t(something[i])
>
> significantly faster than something like this:
>
> def t(whatnot):
>     s = re.compile('whatever')
>     return s.search(whatnot)
>
> for i in xrange(1000):
>     result = t(something[i])
>
> ? Or is Python clever enough to see that the value of s will be the same
> on every call, and thus only compile it once?



Let's find out:

>>> import re
>>> import dis
>>>
>>> def spam(x):
...     s = re.compile('nobody expects the Spanish Inquisition!')
...     return s.search(x)
...
>>> dis.dis(spam)
  2           0 LOAD_GLOBAL              0 (re)
              3 LOAD_ATTR                1 (compile)
              6 LOAD_CONST               1 ('nobody expects the Spanish Inquisition!')
              9 CALL_FUNCTION            1
             12 STORE_FAST               1 (s)

  3          15 LOAD_FAST                1 (s)
             18 LOAD_ATTR                2 (search)
             21 LOAD_FAST                0 (x)
             24 CALL_FUNCTION            1
             27 RETURN_VALUE



No, the Python compiler doesn't know anything about regular expression
objects, so it compiles a call to the RE engine which is executed every
time the function is called.

However, the re module keeps its own cache, so in fact the regular
expression itself may only get compiled once regardless.
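
The same point can be checked without reading bytecode. This is a sketch assuming CPython 3.8 or later (for re.Pattern): the function's code object stores only the pattern text as a constant, while the names re and compile are resolved again on every call.

```python
import re


def spam(x):
    s = re.compile('nobody expects the Spanish Inquisition!')
    return s.search(x)


code = spam.__code__

# Global and attribute names that the bytecode resolves at run time.
print(code.co_names)

# Constants stored in the function: the pattern text, but no compiled regex.
print(code.co_consts)
has_compiled_regex = any(isinstance(c, re.Pattern) for c in code.co_consts)
print(has_compiled_regex)
```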

Here's another approach that avoids the use of a global variable for the
regular expression:

>>> def spam2(x, s=re.compile('nobody expects the Spanish Inquisition!')):
...     return s.search(x)
...
>>> dis.dis(spam2)
  2           0 LOAD_FAST                1 (s)
              3 LOAD_ATTR                0 (search)
              6 LOAD_FAST                0 (x)
              9 CALL_FUNCTION            1
             12 RETURN_VALUE

What happens now is that the regex is compiled by the RE engine once, when
the def statement is executed (default argument values are evaluated at
function definition time), then stored as the default value for the
argument s. If you don't supply another value for s when you call the
function, the default regex is used. If you do, the overriding value is
used instead:

>>> spam2("nothing")
>>> spam2("nothing", re.compile('thing'))
<_sre.SRE_Match object at 0xb7c29c28>


I suspect that this will be not only the fastest solution, but also the
most flexible.
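
A small sketch in Python 3 of when the default value is evaluated: once, as the def statement runs, and not on each call. The make_pattern helper and the calls list are hypothetical additions whose only job is to count evaluations.

```python
import re

calls = []  # records each evaluation of the default expression


def make_pattern():
    calls.append("compiled")
    return re.compile('nobody expects the Spanish Inquisition!')


def spam2(x, s=make_pattern()):  # make_pattern() runs here, once
    return s.search(x)


spam2("nothing")
spam2("nobody expects the Spanish Inquisition!")
print(len(calls))

# The compiled object is stored on the function itself.
print(spam2.__defaults__[0].pattern)

# Passing s explicitly overrides the default for that call only.
print(spam2("nothing", re.compile('thing')) is not None)
```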



--
Steven
 
Dustan
      12-22-2007
On Dec 22, 6:04 am, John Machin <(E-Mail Removed)> wrote:
> t3 = re.compile('whatever').search


Ack! No! Too Pythonic! GETITOFF! GETITOFF!!
 
Terry Reedy
      12-22-2007

"Steven D'Aprano" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed)...
| >>> def spam2(x, s=re.compile('nobody expects the Spanish Inquisition!')):
| ...     return s.search(x)
|
| I suspect that this will be not only the fastest solution, but also the
| most flexible.

'Most flexible' in a different way is

def searcher(rex):
    crex = re.compile(rex)
    def _(txt):
        return crex.search(txt)
    return _

One can then create and keep around multiple searchers based on different
patterns, to be used as needed.
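
For instance, a sketch in Python 3 with two made-up patterns (find_word and find_number are illustrative names), each searcher carrying its own compiled regex:

```python
import re


def searcher(rex):
    crex = re.compile(rex)  # compiled once per searcher

    def _(txt):
        return crex.search(txt)

    return _


find_word = searcher(r"[A-Za-z]+")
find_number = searcher(r"\d+")

text = "order 66 executed"
print(find_word(text).group())
print(find_number(text).group())
```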

tjr



 
John Machin
      12-22-2007
On Dec 23, 5:38 am, "Terry Reedy" <(E-Mail Removed)> wrote:
> "Steven D'Aprano" <(E-Mail Removed)> wrote in message
>
> news:(E-Mail Removed)...
> | >>> def spam2(x, s=re.compile('nobody expects the Spanish Inquisition!')):
> | ...     return s.search(x)
> |
> | I suspect that this will be not only the fastest solution, but also the
> | most flexible.
>
> 'Most flexible' in a different way is
>
> def searcher(rex):
>     crex = re.compile(rex)
>     def _(txt):
>         return crex.search(txt)
>     return _
>


I see your obfuscatory ante and raise you several dots and
underscores:

class Searcher(object):
    def __init__(self, rex):
        self.crex = re.compile(rex)
    def __call__(self, txt):
        return self.crex.search(txt)

Cheers,
John

 
Terry Reedy
      12-23-2007

"John Machin" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed)...
| On Dec 23, 5:38 am, "Terry Reedy" <(E-Mail Removed)> wrote:
| > 'Most flexible' in a different way is
| >
| > def searcher(rex):
| >     crex = re.compile(rex)
| >     def _(txt):
| >         return crex.search(txt)
| >     return _
| >
|
| I see your obfuscatory ante and raise you several dots and
| underscores:

I will presume you are merely joking, but for the benefit of any beginning
programmers reading this, the closure above is a standard functional idiom
for partial evaluation of a function (in this case, re.search(crex, txt)),

| class Searcher(object):
|     def __init__(self, rex):
|         self.crex = re.compile(rex)
|     def __call__(self, txt):
|         return self.crex.search(txt)

while this is the equivalent OO version. Intermediate Python programmers
should know both.
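
A sketch in Python 3 of how the two forms line up. functools.partial is included as a third, standard-library spelling of the same partial application; it is an addition for comparison and not something the thread itself uses.

```python
import functools
import re


def searcher(rex):
    crex = re.compile(rex)

    def _(txt):
        return crex.search(txt)

    return _


class Searcher(object):
    def __init__(self, rex):
        self.crex = re.compile(rex)

    def __call__(self, txt):
        return self.crex.search(txt)


# re.search accepts an already compiled pattern as its first argument.
partial_search = functools.partial(re.search, re.compile("foo"))

candidates = [searcher("foo"), Searcher("foo"), partial_search]
results = [f("some food").span() for f in candidates]
print(results)
```

All three callables take the text as their only argument and return the same match.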

tjr



 
John Machin
      12-23-2007
On Dec 23, 2:39 pm, "Terry Reedy" <(E-Mail Removed)> wrote:
> "John Machin" <(E-Mail Removed)> wrote in message
>
> news:(E-Mail Removed)...
> | On Dec 23, 5:38 am, "Terry Reedy" <(E-Mail Removed)> wrote:
> | > 'Most flexible' in a different way is
> | >
> | > def searcher(rex):
> | >     crex = re.compile(rex)
> | >     def _(txt):
> | >         return crex.search(txt)
> | >     return _
> | >
> |
> | I see your obfuscatory ante and raise you several dots and
> | underscores:
>
> I will presume you are merely joking, but for the benefit of any beginning
> programmers reading this, the closure above is a standard functional idiom
> for partial evaluation of a function (in this case, re.search(crex, txt)),
>
> | class Searcher(object):
> |     def __init__(self, rex):
> |         self.crex = re.compile(rex)
> |     def __call__(self, txt):
> |         return self.crex.search(txt)
>
> while this is the equivalent OO version. Intermediate Python programmers
> should know both.
>


Semi-joking; I thought that your offering of this:

def searcher(rex):
    crex = re.compile(rex)
    def _(txt):
        return crex.search(txt)
    return _
foo_searcher = searcher('foo')

was somewhat over-complicated, and possibly slower than already-
mentioned alternatives. The standard idiom etc etc it may be, but the
OP was interested in getting overhead out of his re searching loop.
Let's trim it a bit.

step 1:
def searcher(rex):
    crexs = re.compile(rex).search
    def _(txt):
        return crexs(txt)
    return _
foo_searcher = searcher('foo')

step 2:
def searcher(rex):
    return re.compile(rex).search
foo_searcher = searcher('foo')

step 3:
foo_searcher = re.compile('foo').search
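
A quick sketch (Python 3, assuming CPython) of why step 3 is enough: the bound method keeps a reference to the compiled pattern it came from, so nothing else has to hold on to it. The sample strings are made up.

```python
import re

foo_searcher = re.compile('foo').search

# The bound method remembers the pattern object it belongs to.
print(foo_searcher.__self__.pattern)

match = foo_searcher('seafood')
print(match.span() if match else None)
```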

HTH,
John
 