Velocity Reviews


max(01)* 08-22-2005 10:08 AM

regular expressions use
 
hi everyone.

i would like to do some uri-decoding, which means to translate patterns
like "%2b/dhg-%3b %7E" into "+/dhg-; ~": in practice, if a sequence like
"%2b" is found, it should be translated into one character whose hex
ascii code is 2b.

i did this:

....
import re
import sys

modello = re.compile("%([0-9a-f][0-9a-f])", re.IGNORECASE)

def funzione(corrispondenza):
    return chr(eval('0x' + corrispondenza.group(1)))

for riga in sys.stdin:
    riga = modello.sub(funzione, riga)
    sys.stdout.write(riga)
....

please comment on it. can it be done more easily or more compactly? i am a
python regexp novice.

bye

max

ps: i was trying to pythonate this kind of perl code:

$riga =~ s/%([A-Fa-f0-9][A-Fa-f0-9])/chr(hex($1))/ge;

Peter Otten 08-22-2005 10:20 AM

Re: regular expressions use
 
max(01)* wrote:

> i would like to do some uri-decoding, which means to translate patterns
> like "%2b/dhg-%3b %7E" into "+/dhg-; ~": in practice, if a sequence like
> "%2b" is found, it should be translated into one character whose hex
> ascii code is 2b.
>
> i did this:
>
> ...
> import re
> import sys
>
> modello = re.compile("%([0-9a-f][0-9a-f])", re.IGNORECASE)
>
> def funzione(corrispondenza):
>     return chr(eval('0x' + corrispondenza.group(1)))


You can specify the base for str to int conversion, e.g.:

    return chr(int(corrispondenza.group(1), 16))

And then there is also urllib.unquote() in the library.
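
For illustration, the int() suggestion can be folded into the substitution itself. This is only a minimal sketch, assuming Python 3; the helper name decode is made up here and does not come from the thread:

```python
import re

# Same pattern as in the original post; {2} replaces the repeated class.
modello = re.compile(r"%([0-9a-f]{2})", re.IGNORECASE)

def decode(text):
    # 'decode' is a hypothetical helper name, not from the thread.
    # int(..., 16) parses the two hex digits, so eval() is not needed.
    return modello.sub(lambda m: chr(int(m.group(1), 16)), text)

print(decode("%2b/dhg-%3b %7E"))  # +/dhg-; ~
```

Anything that is not "%" followed by two hex digits is left alone, so a stray "%zz" passes through unchanged.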

Peter


Paul McGuire 08-22-2005 12:10 PM

Re: regular expressions use
 
Perhaps a bit more verbose than your Perl regexp, here is a decoder
using pyparsing.

-- Paul

# download pyparsing at http://pyparsing.sourceforge.net
from pyparsing import Word,Combine

# define grammar for matching encoded characters
hexnums = "0123456789ABCDEFabcdef"
encodedChar = Combine( "%" + Word(hexnums,exact=2) )

# define and attach conversion action
def unencode(s,l,toks):
    return chr(int(toks[0][1:],16))
encodedChar.setParseAction( unencode )

# transform test string
data = "%2b/dhg-%3b %7E"
print encodedChar.transformString( data )
"""
Prints "+/dhg-; ~":
"""


Fredrik Lundh 08-22-2005 12:18 PM

Re: regular expressions use
 
"max(01)*" <max2@fisso.casa> wrote:

> i would like to do some uri-decoding, which means to translate patterns
> like "%2b/dhg-%3b %7E" into "+/dhg-; ~"


>>> import urllib
>>> urllib.unquote("%2b/dhg-%3b %7E")
'+/dhg-; ~'
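
The session above is Python 2. As a rough sketch, assuming Python 3, the same function is found in urllib.parse; the variable names below are only for illustration:

```python
from urllib.parse import unquote, unquote_to_bytes

# Same test string as in the thread.
decoded = unquote("%2b/dhg-%3b %7E")
print(decoded)  # +/dhg-; ~

# In Python 3, unquote() decodes the escaped bytes as UTF-8 by default,
# so a multi-byte sequence becomes a single character.
accented = unquote("%C3%A9")
print(accented)  # é

# unquote_to_bytes() returns the raw bytes without decoding them.
raw = unquote_to_bytes("%C3%A9")
print(raw)  # b'\xc3\xa9'
```

This differs from a per-escape chr(int(..., 16)) substitution, which would turn "%C3%A9" into two separate characters.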

</F>




