Python - Re: How to write replace string for object which will be substituted?[regexp]

 
Old 08-05-2009, 12:28 PM   #1

MRAB wrote:
> ryniek90 wrote:
>> Hi.
>> I started learning regexps, and some things go well, but most of
>> them still don't.
>>
>> I've got a problem with one regexp. Better to post the code here:
>>
>> "
>> >>> import re
>> >>> mail = '\\nname1 [at] mail [dot] com\nname2 [$at$] mail [$dot$] com\n'
>> >>> mail
>> '\\nname1 [at] mail [dot] com\nname2 [$at$] mail [$dot$] com\n'
>> >>> print mail
>>
>>
>> name1 [at] mail [dot] com
>> name2 [$at$] mail [$dot$] com
>>
>> >>> maail = re.sub('^\n|$\n', '', mail)
>> >>> print maail
>>
>> name1 [at] mail [dot] com
>> name2 [$at$] mail [$dot$] com
>> >>> maail = re.sub(' ', '', maail)
>> >>> print maail
>>
>> name1[at]mail[dot]com
>> name2[$at$]mail[$dot$]com
>> >>> maail = re.sub('\[at\]|\[\$at\$\]', '@', maail)
>> >>> print maail
>>
>> name1@mail[dot]com
>> name2@mail[$dot$]com
>> >>> maail = re.sub('\[dot\]|\[\$dot\$\]', '.', maail)
>> >>> print maail
>>
>>
>>
>> >>> #How must i write the replace string to replace all this regexp's with just ONE command, in string 'mail' ?
>> >>> maail = re.sub('^\n|$\n| |\[at\]|\[\$at\$\]|\[dot\]|\[\$dot\$\]', *?*, mail)
>> "
>>
>> How must I write that replacement pattern (look at the question mark) to
>> make that substitution work? I didn't see anything helpful while
>> reading the re docs and HowTo (from the Python docs). I tried with
>> 'MatchObject.group()' but something went wrong; I didn't write it right.
>> Is there a more user-friendly HowTo for Python re than this?
>>
>> I'm new to programming and regexps, sorry for the inconvenience.
>>

> I don't think you can do it in one regex, nor would I want to. Just use
> the string's replace() method.
>
> >>> mail = '\\nname1 [at] mail [dot] com\nname2 [$at$] mail [$dot$] com\n'
> >>> mail
> '\\nname1 [at] mail [dot] com\nname2 [$at$] mail [$dot$] com\n'
> >>> print mail
>
>
> name1 [at] mail [dot] com
> name2 [$at$] mail [$dot$] com
>
> >>> maail = mail.strip()
>
> name1 [at] mail [dot] com
> name2 [$at$] mail [$dot$] com
>
> >>> maail = maail.replace(' ', '')
> >>> print maail
>
> name1[at]mail[dot]com
> name2[$at$]mail[$dot$]com
> >>> maail = maail.replace('[at]', '@').replace('[$at$]', '@')
> >>> print maail
>
> name1@mail[dot]com
> name2@mail[$dot$]com
> >>> maail = maail.replace('[dot]', '.').replace('[$dot$]', '.')
> >>> print maail
>
>
>

This is a good learning exercise demonstrating the impracticality of
regular expressions in a given situation. In the light of the
fascination regular expressions seem to exert in general, one might
conclude that knowing regular expressions in essence is knowing when not
to use them.

There is nothing wrong with cascading substitutions through multiple
expressions. The OP's solution, wrapped up in a function and with the
needless regex overkill stripped out, might look something like this:

import re

def translate (s):
    s1 = s.strip ()             # Instead of: s1 = re.sub ('^\n|$\n', '', s)
    s2 = s1.replace (' ', '')   # Instead of: s2 = re.sub (' ', '', s1)
    s3 = re.sub (r'\[at\]|\[\$at\$\]', '@', s2)
    s4 = re.sub (r'\[dot\]|\[\$dot\$\]', '.', s3)
    return s4

print (translate (mail))   # Tested
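
For completeness, the one-call form the OP asked about can be written if the replacement argument is a function rather than a string, since re.sub accepts a callable and calls it once per match. What follows is only a sketch of that idea, not something posted earlier in the thread: the sample string and the names PATTERN, REPLACEMENTS and substitute are made up for illustration, and the leading and trailing newlines are still handled by strip ().

```python
import re

# Hypothetical sample input, modeled on the strings shown in this thread.
mail = '\nname1 [at] mail [dot] com\nname2 [$at$] mail [$dot$] com\n'

# One alternation covering every target; a single space is a target too.
PATTERN = re.compile(r'\[at\]|\[\$at\$\]|\[dot\]|\[\$dot\$\]| ')

# What each matched target turns into (a space maps to the empty string).
REPLACEMENTS = {'[at]': '@', '[$at$]': '@', '[dot]': '.', '[$dot$]': '.', ' ': ''}

def substitute(match):
    # re.sub calls this once per match and inserts whatever it returns.
    return REPLACEMENTS[match.group(0)]

cleaned = PATTERN.sub(substitute, mail.strip())
print(cleaned)
```

Whether this reads better than the cascade is a matter of taste; it mainly pays off when the table of replacements grows.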

MRAB's solution using replace () avoids needless regex complexity, but
doesn't simplify tedious coding if the number of substitutions is
significant. Some time ago I proposed a little module I made to
alleviate the tedium. It would handle this case like this:

import SE
Translator = SE.SE ( ' (32)= [at]=@ [$at$]=@ [dot]=. [$dot$]=. ' )
print Translator (mail.strip ()) # Tested

So SE.SE compiles a string composed of any number of substitution
definitions into an object that translates anything given it. In a
running speed contest it would surely come in last, although in most
cases the disadvantage would be imperceptible. Another matter is coding
speed. Here the advantage is obvious, even with a set of substitutions
as small as this one, let alone with sets in the tens or even hundreds.
One inconspicuous but significant feature of SE is that it handles
precedence correctly if targets overlap (upstream over downstream and
long over short). As far as I know there's nothing in the Python system
handling substitution precedence. It always needs to be hand-coded from
one case to the next and that isn't exactly trivial.
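
To make the precedence point concrete, here is a rough hand-coded sketch using only the standard re module. It is not how SE works internally (that is not shown in this thread); make_translator is a hypothetical helper name, and the table is assumed to be non-empty. The regex scan gives leftmost ("upstream") matches priority, and sorting the targets longest-first makes the longer target win when two start at the same position.

```python
import re

def make_translator(table):
    """Return a function applying every substitution in `table` in one pass.

    Assumes `table` is a non-empty dict mapping target strings to replacements.
    """
    # Longest targets first, so 'abc' beats 'ab' when both match at one position.
    targets = sorted(table, key=len, reverse=True)
    # re.escape keeps characters such as '[' and '$' literal.
    pattern = re.compile('|'.join(re.escape(t) for t in targets))
    return lambda text: pattern.sub(lambda m: table[m.group(0)], text)

# The substitutions discussed in this thread.
translate = make_translator(
    {'[at]': '@', '[$at$]': '@', '[dot]': '.', '[$dot$]': '.', ' ': ''})
result = translate('name2 [$at$] mail [$dot$] com')

# Overlapping targets: the longer one wins at the same starting position.
overlap = make_translator({'ab': '1', 'abc': '2', 'bc': '3'})
overlap_result = overlap('abc xbc ab')
print(result)
print(overlap_result)
```

Without the sort, the alternation would try targets in whatever order they happen to be listed, which is exactly the kind of case-by-case hand-coding described above.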

SE can be downloaded from http://pypi.python.org/pypi/SE/2.3.

Frederic






Anthra Norell
Old 08-05-2009, 01:21 PM   #2
ryniek
 
Thanks again.

I saw that MRAB is actively developing a new implementation of the re
module.
MRAB: do you think it would be a good idea to add some of the best
features of the SE module to your project?
I haven't looked at the features of your re module yet, but I will try to
find time, even today, to see what's going on.

Greets


ryniek