regular expressions use

 
 
max(01)*
      08-22-2005
hi everyone.

i would like to do some uri-decoding, which means to translate patterns
like "%2b/dhg-%3b %7E" into "+/dhg-; ~": in practice, if a sequence like
"%2b" is found, it should be translated into one character whose hex
ascii code is 2b.

i did this:

....
import re
import sys

modello = re.compile("%([0-9a-f][0-9a-f])", re.IGNORECASE)

def funzione(corrispondenza):
    return chr(eval('0x' + corrispondenza.group(1)))

for riga in sys.stdin:
    riga = modello.sub(funzione, riga)
    sys.stdout.write(riga)
....

please comment on it. can it be made simpler or more compact? i am a
python regexp novice.

bye

max

ps: i was trying to pythonate this kind of perl code:

$riga =~ s/%([A-Fa-f0-9][A-Fa-f0-9])/chr(hex($1))/ge;
 
Peter Otten
      08-22-2005
max(01)* wrote:

> i would like to do some uri-decoding, which means to translate patterns
> like "%2b/dhg-%3b %7E" into "+/dhg-; ~": in practice, if a sequence like
> "%2b" is found, it should be translated into one character whose hex
> ascii code is 2b.
>
> i did this:
>
> ...
> import re
> import sys
>
> modello = re.compile("%([0-9a-f][0-9a-f])", re.IGNORECASE)
>
> def funzione(corrispondenza):
>     return chr(eval('0x' + corrispondenza.group(1)))


You can specify the base for the str-to-int conversion, which avoids eval(), e.g.:

    return chr(int(corrispondenza.group(1), 16))

And then there is also urllib.unquote() in the library.
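
For illustration, here is a minimal sketch of the int(..., 16) suggestion folded back into the original substitution. The helper name decodifica and the {2} quantifier are assumptions added here, not part of the thread; the lambda simply replaces the separate callback function.

```python
import re

# A percent sign followed by exactly two hex digits.
modello = re.compile(r"%([0-9a-fA-F]{2})")

def decodifica(riga):
    # decodifica is a hypothetical helper name used only for this sketch.
    # int(..., 16) parses the two captured hex digits without eval().
    return modello.sub(lambda m: chr(int(m.group(1), 16)), riga)

print(decodifica("%2b/dhg-%3b %7E"))  # prints: +/dhg-; ~
```

Sequences that are not valid hex escapes, such as "%zz" or a trailing "%", do not match the pattern and pass through unchanged.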

Peter

 
Paul McGuire
      08-22-2005
Perhaps a bit more verbose than your Perl regexp, here is a decoder
using pyparsing.

-- Paul

# download pyparsing at http://pyparsing.sourceforge.net
from pyparsing import Word,Combine

# define grammar for matching encoded characters
hexnums = "0123456789ABCDEFabcdef"
encodedChar = Combine( "%" + Word(hexnums,exact=2) )

# define and attach conversion action
def unencode(s,l,toks):
    return chr(int(toks[0][1:],16))
encodedChar.setParseAction( unencode )

# transform test string
data = "%2b/dhg-%3b %7E"
print(encodedChar.transformString( data ))
# Prints "+/dhg-; ~"

 
Fredrik Lundh
      08-22-2005
"max(01)*" wrote:

> i would like to do some uri-decoding, which means to translate patterns
> like "%2b/dhg-%3b %7E" into "+/dhg-; ~"


>>> import urllib
>>> urllib.unquote("%2b/dhg-%3b %7E")
'+/dhg-; ~'
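
The session above is Python 2. In Python 3 the same function lives in urllib.parse, so a version-tolerant sketch (the try/except fallback is an addition here, not something from the thread) might look like this:

```python
try:
    from urllib.parse import unquote  # Python 3 location
except ImportError:
    from urllib import unquote  # Python 2 location, as used in this thread

print(unquote("%2b/dhg-%3b %7E"))  # prints: +/dhg-; ~
```

One difference worth knowing: in Python 3, unquote() decodes the escaped bytes as UTF-8 by default, so for escapes of 0x80 and above it will not give the same result as a plain chr(int(..., 16)) per escape.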

</F>



 