
re.I slowness

 
 
vvikram@gmail.com
      03-30-2006
We process a lot of messages in a file based on some regex pattern(s)
we have in a db.
If I compile the regex using re.I, the processing time is substantially
higher than if I don't, i.e. using re.I is slow.

However, more surprisingly, if we do something along the lines of:

import re
import string

s = <regex string>
s = s.lower()
# map each lowercase letter to a class matching both cases, e.g. 'a' -> '[aA]'
t = dict([(k, '[%s%s]' % (k, k.upper())) for k in string.ascii_lowercase])
for k in t: s = s.replace(k, t[k])
re.compile(s)
.......

it's much better than plainly using re.I.

So the questions are:
a) Why is re.I so slow in general?
b) What is the underlying implementation used, what is wrong, if
anything, with the above method, and why is it not used instead?

Thanks
Vikram
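
One way to check the claim on a given Python build is to time both forms side by side. The sketch below is only an illustration: the pattern "error" and the sample text are made-up stand-ins for the real patterns in the db, and the helper name expand_case is hypothetical. It only handles patterns made of plain letters. Timings depend on the interpreter version and the data, so no numbers are shown.

```python
import re
import string
import timeit

def expand_case(pattern):
    # Same idea as the snippet above: turn each ASCII letter into a
    # two-character class, e.g. "e" -> "[eE]". Only safe for patterns
    # that consist of plain letters (no escapes, no character classes).
    pattern = pattern.lower()
    table = {k: "[%s%s]" % (k, k.upper()) for k in string.ascii_lowercase}
    return "".join(table.get(ch, ch) for ch in pattern)

# Made-up sample data standing in for the real messages.
text = ("some log line without a hit\n" * 2000) + "fatal ERROR here\n"

flagged = re.compile("error", re.I)
expanded = re.compile(expand_case("error"))

# Both forms should find the same matches before timing means anything.
assert flagged.findall(text) == expanded.findall(text)

t_flag = timeit.timeit(lambda: flagged.search(text), number=200)
t_expanded = timeit.timeit(lambda: expanded.search(text), number=200)
print("re.I:     %.4f s" % t_flag)
print("expanded: %.4f s" % t_expanded)
```

Running this against a sample of the real patterns and messages would show whether the gap is still there on the Python version in use.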

 
Paul McGuire
      03-30-2006
Can't tell you why re.I is slow, but perhaps this expression will make your
RE transform a little plainer (no need to create that dictionary of uppers
and lowers).

s = <regex string>
makeReAlphaCharLowerOrUpper = lambda c : (c.isalpha() and
    "[%s%s]" % (c.lower(),c.upper()) or c)
s_optimized = "".join( makeReAlphaCharLowerOrUpper(k) for k in s)

or

s_optimized = "".join( map( makeReAlphaCharLowerOrUpper, s ) )
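
For reference, here is the same per-character transform written as a plain function with a conditional expression instead of the and/or idiom. The function name expand_char and the sample pattern "spam1" are illustrative and not from the posts above.

```python
import re

def expand_char(c):
    # Letters become a two-character class; everything else is kept as is.
    return "[%s%s]" % (c.lower(), c.upper()) if c.isalpha() else c

s = "spam1"  # illustrative pattern made of plain characters only
s_optimized = "".join(expand_char(k) for k in s)
print(s_optimized)  # [sS][pP][aA][mM]1

# The expanded pattern accepts the same text as the original with re.I.
for candidate in ("spam1", "SPAM1", "SpAm1", "spam2"):
    assert bool(re.fullmatch(s_optimized, candidate)) == \
           bool(re.fullmatch(s, candidate, re.I))
```

This only holds for patterns built from plain characters; anything with escapes or character classes needs more care.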


Just curious, but what happens if your RE contains something like this
spelling check error finder:
"[^c]ei"
(looking for violations of "i before e except after c")

Can []'s nest in an RE?

-- Paul
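
On the closing question: Python's re module does not support nested character classes, so a per-character expansion changes the meaning of a pattern like "[^c]ei". A small sketch shows the effect (the helper name expand_char and the test strings are illustrative). Escapes are mangled in a similar way, e.g. \d would become \[dD].

```python
import re

def expand_char(c):
    # Naive per-character expansion, applied without looking at context.
    return "[%s%s]" % (c.lower(), c.upper()) if c.isalpha() else c

original = "[^c]ei"
naive = "".join(expand_char(k) for k in original)
print(naive)  # [^[cC]][eE][iI]

# The naive result parses as the class [^[cC] followed by a literal "]",
# so it no longer finds "wei" in "weird" ...
assert re.search(original, "weird", re.I) is not None
assert re.search(naive, "weird") is None
# ... and instead matches text that contains a literal "]".
assert re.search(naive, "x]ei") is not None
```

So the expansion trick is only safe when the regex strings are known to contain no character classes or escapes, or when the transform is made aware of them.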


 