Conversion of perl based regex to python method

 
 
Andrew Robert
      05-24-2006
I have two Perl expressions


If windows:

perl -ple "s/([^\w\s])/sprintf(q#%%%2X#, ord $1)/ge" somefile.txt

If posix

perl -ple 's/([^\w\s])/sprintf("%%%2X", ord $1)/ge' somefile.txt



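The [^\w\s] is a negated character class: it matches any character that
is not a word character (a-zA-Z0-9_) or whitespace (space, tab, newline),
so those characters are left untouched.

The () captures whatever matches and puts it into $1 for
processing by the sprintf.

In this case the format is %%%2X: a literal % followed by the character
code as uppercase hex in a field two characters wide, which gives a
three-character result such as %21.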

How would you convert this to a python equivalent using the re or
similar module?

I've begun reading about using re expressions at
http://www.amk.ca/python/howto/regex/ but I am still hazy on implementation.

Any help you can provide would be greatly appreciated.

Thanks,
Andy
 
Andrew Robert
      05-24-2006

Okay, I got part of it.

The code and results below seem to handle the first part of the expression.

I believe the next part is to iterate across each of the characters,
evaluate the result, and replace it with hex as needed.


# Import the module
import re

# Open test file
file=open(r'm:\mq\mq\scripts\testme.txt','r')

# Read in a sample line
line=file.readline()

# Compile expression that matches any character other than
# word characters and whitespace (space/tab/newline)
pattern=re.compile('[^\w\s]')

# Look to see if I can find a non-standard character
# from test line #! C:\Python24\Python

var=pattern.match('!')

# gotcha!
print var
<_sre.SRE_Match object at 0x009DA8E0>

# I got
print var.group()

!

# See if pattern will come back with something it shouldn't
var =pattern.match('C')
print var

#I got
None
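
One thing worth noting about the test above: pattern.match() only looks at the start of the string, so it tells you whether the first character is special, not whether the line contains one. A minimal sketch of scanning a whole line instead, assuming Python 3 syntax (the session above is Python 2) and a hypothetical helper name special_chars that is not part of the original post:

```python
import re

# Same character class as in the thread: anything that is not a word
# character or whitespace.
pattern = re.compile(r"[^\w\s]")

def special_chars(line):
    """Return every character in line that the pattern matches (hypothetical helper)."""
    return pattern.findall(line)

# The sample line from the test file mentioned above.
sample = "#! C:\\Python24\\Python"
print(special_chars(sample))

# match() anchors at position 0; search() scans the whole string.
print(pattern.match("C!"))           # no match at the first character
print(pattern.search("C!").group())  # finds the later '!'
```

findall() and search() only report what would be replaced; the replacement itself still needs re.sub() with a function argument.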



Instead of being so linear, I was thinking that this might be closer.
I still have to figure out the hex line, but then we are golden.


# Evaluate captured character as hex
def ret_hex(ch):
    return chr((ord(ch) + 1) % )

# Evaluate the value of whatever was matched
def eval_match(match):
    return ret_hex(match.group(0))

# open file
file = open(r'm:\mq\mq\scripts\testme.txt','r')

# Read each line, pass any matches on line to function
for line in file.readlines():
    re.sub('[^\w\s]',eval_match, line)
 
Peter Otten
      05-25-2006
Andrew Robert wrote:

Wanted:

> perl -ple 's/([^\w\s])/sprintf("%%%2X", ord $1)/ge' somefile.txt


Got:

> # Evaluate captured character as hex
> def ret_hex(ch):
>     return chr((ord(ch) + 1) % )


Make it compile at least before posting.

> # Evaluate the value of whatever was matched
> def eval_match(match):
>     return ret_hex(match.group(0))
>
> # open file
> file = open(r'm:\mq\mq\scripts\testme.txt','r')
>
> # Read each line, pass any matches on line to function
> for line in file.readlines():
>     re.sub('[^\w\s]',eval_match, line)


for line in file:
    ...

without readlines() is better because it doesn't read the whole file into
memory first. If you want to read data from files passed as commandline
args or from stdin you can use fileinput.input():

import re
import sys
import fileinput

def replace(match):
    return "%%%2X" % ord(match.group(0))

for line in fileinput.input():
    sys.stdout.write(re.sub(r"[^\w\s]", replace, line))

Peter
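
For anyone who wants to try the substitution on a string rather than a file, here is a small sketch along the lines of the replace() function above. It assumes Python 3, and it deliberately differs from the thread in one detail: it uses "%02X" (zero-padded) instead of "%2X", because "%2X" pads code points below 0x10 with a space, as the Perl original does. The names PATTERN and encode_line are made up for this illustration.

```python
import re

# Same character class as in the thread, written as a raw string.
PATTERN = re.compile(r"[^\w\s]")

def replace(match):
    # "%%" yields a literal percent sign; "%02X" is the code point as
    # zero-padded uppercase hex (assumption: the thread uses "%2X").
    return "%%%02X" % ord(match.group(0))

def encode_line(line):
    """Return line with every non-word, non-whitespace character hex-escaped."""
    return PATTERN.sub(replace, line)

print(encode_line("Hello, World!"))  # prints Hello%2C World%21
```

Note that sub() returns a new string and leaves the original untouched, so the result has to be written out or assigned; calling it and discarding the return value changes nothing.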

 