Bug? concatenate a number to a backreference: re.sub(r'(zzz:)xxx',r'\1'+str(4444), somevar)

 
 
abdulet
      10-23-2009
Is this normal? I want to concatenate a number to a backreference in
a regular expression replacement. I'm working on a multiprocess
script, so at first I thought the error was in the multiprocess
logic. But what a surprise! After some time debugging I arrived at
this:

import re
aa = "zzzxx"
re.sub(r'(zzz).*',r'\1'+str(3333),aa)
'[33'

Huh? Well, let's put a ':' after the backreference:

aa = "zzzxx"
re.sub(r'(zzz).*',r'\1:'+str(3333),aa)
'zzz:3333'

Now it's the expected result. So should I expect Python to
concatenate the string to the backreference before substituting the
backreference, or is this a bug?

Tested on:
Python 2.6.2 (r262:71605, Apr 14 2009, 22:40:02) [MSC v.1500 32 bit (Intel)] on win32
Python 2.5.2 (r252:60911, Jan 4 2009, 17:40:26) [GCC 4.3.2] on linux2

with the same result.

Cheers!
 
 
Peter Otten
      10-23-2009
abdulet wrote:

> Is this normal? I want to concatenate a number to a backreference in
> a regular expression replacement. I'm working on a multiprocess
> script, so at first I thought the error was in the multiprocess
> logic. But what a surprise! After some time debugging I arrived at
> this:
>
> import re
> aa = "zzzxx"
> re.sub(r'(zzz).*',r'\1'+str(3333),aa)
> '[33'


If you perform the addition you get r"\13333". How should the regular
expression engine interpret that? As the backreference to group 1, 13, ...
or 13333? It picks something completely different, "[33", because "\133" is
the octal escape sequence for "[":

>>> chr(0133)  # Python 2 octal literal; spelled 0o133 in Python 2.6+ and 3
'['
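
The interpretation described above can be checked directly. This is a minimal sketch, assuming Python 3 (where the octal literal is written 0o133); the variable names are illustrative only:

```python
import re

text = "zzzxx"

# Concatenating the digits onto r"\1" yields the template \13333.
template = r"\1" + str(3333)

# The template parser reads \133 as an octal escape for "[",
# then copies the remaining "33" literally. Group 1 is never used.
result = re.sub(r"(zzz).*", template, text)

print(repr(template))  # '\\13333'
print(result)          # [33
print(chr(0o133))      # [
```

So the concatenation happens in ordinary Python before re.sub() ever sees the replacement string, and the regular expression engine only receives the already-joined template.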

You can avoid the ambiguity with

extra = str(number)
extra = re.escape(extra)
re.sub(expr, r"\g<1>" + extra, text)

The re.escape() step is not necessary here, but a good idea in the general
case when extra is an arbitrary string.
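
A runnable sketch of the \g<1> form, assuming Python 3; the helper names append_to_group and append_with_function are hypothetical. One caveat on the escaping step: the re documentation describes re.escape() as intended for patterns and says only backslashes should be escaped in a replacement string, so this sketch escapes backslashes instead. Passing a function as the replacement is another option, since its return value is not parsed as a template.

```python
import re

def append_to_group(pattern, extra, text):
    """Replace each match with group 1 followed by the literal string extra."""
    # Escape backslashes so extra is not read as template syntax.
    safe = extra.replace("\\", r"\\")
    # \g<1> ends the group reference explicitly, so digits may follow it.
    return re.sub(pattern, r"\g<1>" + safe, text)

def append_with_function(pattern, extra, text):
    """Same result using a function, whose return value is used as-is."""
    return re.sub(pattern, lambda m: m.group(1) + extra, text)

print(append_to_group(r"(zzz).*", str(3333), "zzzxx"))       # zzz3333
print(append_to_group(r"(zzz).*", r"\1", "zzzxx"))           # zzz\1
print(append_with_function(r"(zzz).*", str(3333), "zzzxx"))  # zzz3333
```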

Peter

 
 
abdulet
      10-23-2009
Aha! Nice, thanks. I hadn't seen that part of the re module
documentation, and it was right in front of my eyes (as always, it's
something silly, haha). So thanks again, and yes, it's a good idea to
escape the variable.

cheers
 