packing things back to regular expression

 
 
Amit Gupta
      02-20-2008
Hi,

I wonder if Python has a function to pack things back into a regexp
that has group names.

e.g.:
exp = "(?P<name1>[a-z]+)"
compiledexp = re.compile(exp)

Now, I have a dictionary mytable = {"a" : "myname"}

Is there a way in the re module, or elsewhere, to have it match
the contents of the dictionary against the regular expression (and check
that they match the rules) and then return the substituted string?

e.g.
>>> re.SomeNewFunc(compiledexp, mytable)
"myname"
>>> mytable = {"a" : "1"}
>>> re.SomeNewFunc(compiledexp, mytable)
ERROR



Thanks
A
 
Gary Herron
      02-20-2008
Amit Gupta wrote:
> Hi,
>
> I wonder if Python has a function to pack things back into a regexp
> that has group names.
>
> e.g.:
> exp = "(?P<name1>[a-z]+)"
> compiledexp = re.compile(exp)
>
> Now, I have a dictionary mytable = {"a" : "myname"}
>
> Is there a way in the re module, or elsewhere, to have it match
> the contents of the dictionary against the regular expression (and check
> that they match the rules) and then return the substituted string?
>

I'm not following what you're asking for until I get to the last two
words. The re module does have functions to do string substitution.
One or more occurrences of a pattern matched by an re can be replaced
with a given string. See sub and subn. Perhaps you can make one of
those do whatever it is you are trying to do.
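
For reference, a minimal sketch of what sub and subn do, written for Python 3. The input string and the replacement templates are made up for illustration; only the pattern comes from the question.

```python
import re

# Pattern from the question: one named group of lowercase letters.
pattern = re.compile(r"(?P<name1>[a-z]+)")

# sub() replaces every match; the template may refer to named groups.
replaced = pattern.sub(r"<\g<name1>>", "abc 123 def")

# subn() does the same and also reports how many replacements were made.
replaced_n, count = pattern.subn("X", "abc 123 def")

print(replaced)           # <abc> 123 <def>
print(replaced_n, count)  # X 123 X 2
```

Note that both functions substitute into a target string wherever the pattern matches; neither one fills values into the pattern itself.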

Gary Herron

> e.g.
> >>> re.SomeNewFunc(compiledexp, mytable)
> "myname"
> >>> mytable = {"a" : "1"}
> >>> re.SomeNewFunc(compiledexp, mytable)
> ERROR
>
> Thanks
> A


 
Tim Chase
      02-20-2008
> >>> mytable = {"a" : "myname"}
> >>> re.SomeNewFunc(compiledexp, mytable)
> "myname"

How does SomeNewFunc know to pull "a" as opposed to any other key?

> >>> mytable = {"a" : "1"}
> >>> re.SomeNewFunc(compiledexp, mytable)
> ERROR


You could do something like one of the following 3 functions:

import re

ERROR = 'ERROR'

def some_new_func(table, regex):
    "Return processed results for values matching regex"
    result = {}
    for k, v in table.items():
        m = regex.match(v)
        if m:
            result[k] = m.group(1)
        else:
            result[k] = ERROR
    return result

def some_new_func2(table, regex, key):
    "Get value (if matches regex) or ERROR based on key"
    m = regex.match(table[key])
    if m:
        return m.group(0)
    return ERROR

def some_new_func3(table, regex):
    "Sniff the desired key from the regexp (inefficient)"
    for k, v in table.items():
        m = regex.match(v)
        if m:
            # take the first (here: only) named group of the match
            groupname, match = next(iter(m.groupdict().items()))
            if groupname == k:
                return match
    return ERROR

if __name__ == "__main__":
    NAME = 'name1'
    mytable = {
        'a': 'myname',
        'b': '1',
        NAME: 'foo',
    }
    regexp = '(?P<%s>[a-z]+)' % NAME
    print('Using regex:')
    print(regexp)
    print('=' * 10)

    r = re.compile(regexp)
    results = some_new_func(mytable, r)
    print('a: ', results['a'])
    print('b: ', results['b'])
    print('=' * 10)
    print('a: ', some_new_func2(mytable, r, 'a'))
    print('b: ', some_new_func2(mytable, r, 'b'))
    print('=' * 10)
    print('%s: %s' % (NAME, some_new_func3(mytable, r)))

Function #2 is the best choice for single lookups, whereas
Function #1 is best if you plan to repeatedly extract keys from
one set of processed results (the function only gets called
once). Function #3 is just ugly, and generally indicates that you
need to change your tactics.
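
As an alternative to sniffing the group name out of a match, a compiled pattern exposes its group names directly through its groupindex attribute. The following is only a sketch for Python 3; the helper name lookup_by_group_name is hypothetical and not part of the re module.

```python
import re

ERROR = "ERROR"

def lookup_by_group_name(table, regex):
    """For each named group in regex, return table[name] if it matches, else ERROR."""
    results = {}
    for name in regex.groupindex:      # names of all named groups in the pattern
        value = table.get(name)
        m = regex.match(value) if value is not None else None
        results[name] = m.group(name) if m else ERROR
    return results

r = re.compile(r"(?P<name1>[a-z]+)")
print(lookup_by_group_name({"a": "myname", "b": "1", "name1": "foo"}, r))
# {'name1': 'foo'}
```

Keys in the table that do not correspond to a group name ("a" and "b" here) are simply ignored.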

-tkc



 
Amit Gupta
      02-20-2008
Before I read the message: I screwed up.

Let me write it again.

>>> x = re.compile("CL(?P<name1>[a-z]+)")

# The group name "name1" is attached to the match of a lowercase
# alphabetic string.
# Now I have a dictionary {"name1": "iamgood"}.
# I would like a function that takes x and my dictionary and returns
# "CLiamgood".
# If my dictionary is instead {"name1": "123"}, it should give an error
# while processing it.
#
# In general, I have a regular expression where every non-trivial match
# has a group name. I want to do the reverse of a regexp match: the
# function takes the regexp and replaces each group with its value from
# the dictionary.
# I hope this makes it clear.

 
Steven D'Aprano
      02-21-2008
On Wed, 20 Feb 2008 11:36:20 -0800, Amit Gupta wrote:

> Before I read the message: I screwed up.
>
> Let me write it again.
>
> >>> x = re.compile("CL(?P<name1>[a-z]+)")
>
> # The group name "name1" is attached to the match of a lowercase
> # alphabetic string.
> # Now I have a dictionary {"name1": "iamgood"}.
> # I would like a function that takes x and my dictionary and returns
> # "CLiamgood".
> # If my dictionary is instead {"name1": "123"}, it should give an error
> # while processing it.
> #
> # In general, I have a regular expression where every non-trivial match
> # has a group name. I want to do the reverse of a regexp match: the
> # function takes the regexp and replaces each group with its value from
> # the dictionary.
> # I hope this makes it clear.



Clear as mud. But I'm going to take a guess.

Are you trying to validate the data against the regular expression as
well as substitute values? That means your function needs to do something
like this:

(1) Take the regular expression object, and extract the string it was
made from. That way at least you know the regular expression was valid.

x = re.compile("CL(?P<name1>[a-z]+)") # validate the regex
x.pattern

=> "CL(?P<name1>[a-z]+)"


(2) Split the string into sets of three pieces:

split("CL(?P<name1>[a-z]+)") # you need to write this function

=> ("CL", "(?P<name1>", "[a-z]+)")


(3) Mangle the first two pieces:

mangle("CL", "(?P<name1>") # you need to write this function

=> "CL%(name1)s"

(4) Validate the value in the dictionary:

d = {"name1": "123"}
validate("[a-z]+)", d) # you need to write this function

=> raises an exception

d = {"name1": "iamgood"}
validate("[a-z]+)", d)

=> returns True


(5) If the validation step succeeded, then do the replacement:

"CL%(name1)s" % d

=> "CLiamgood"
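
The five steps above can be sketched for the simple, non-nested case roughly as follows (Python 3.4+ for re.fullmatch). The names NAMED_GROUP and fill are hypothetical, and the sketch assumes that group bodies contain no parentheses at all and that everything outside the named groups is literal text.

```python
import re

# Matches one named group whose body contains no parentheses,
# e.g. "(?P<name1>[a-z]+)". Nested groups are deliberately not handled.
NAMED_GROUP = re.compile(r"\(\?P<(?P<name>\w+)>(?P<body>[^()]*)\)")

def fill(compiled, values):
    """Substitute values into the named groups of compiled.pattern."""
    def replace(m):
        name, body = m.group("name"), m.group("body")
        value = values[name]                   # KeyError if the name is missing
        if re.fullmatch(body, value) is None:  # step (4): validate
            raise ValueError("value %r does not match %r" % (value, body))
        return value                           # step (5): substitute
    # Steps (2) and (3) collapse into one pass over the pattern string.
    return NAMED_GROUP.sub(replace, compiled.pattern)

x = re.compile("CL(?P<name1>[a-z]+)")
print(fill(x, {"name1": "iamgood"}))  # CLiamgood
```

Calling fill(x, {"name1": "123"}) raises ValueError. Text outside the named groups is copied verbatim, so the result is only a plain string if that text contains no regex syntax.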


Step (2), the splitter, will be the hardest because you essentially need
to parse the regular expression. You will need to decide how to handle
regexes with multiple "bits", including *nested* expressions, e.g.:

"CL(?P<name1>[a-z]+)XY(?:AB)[aeiou]+(?P<name2>CD(?P<name3>..)\?EF)"


Good luck.
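
For the nested case, one possible shortcut (an assumption, not a full regex parser) is to treat each top-level named group as a unit: scan the pattern while tracking parenthesis depth, then validate the supplied value against the whole group body, nested groups included. The function name fill_top_level is hypothetical; parentheses inside character classes and unbalanced patterns are not handled.

```python
import re

def fill_top_level(pattern, values):
    """Replace each top-level (?P<name>...) group in pattern with values[name]."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\":                        # escaped character: copy both chars
            out.append(pattern[i:i + 2])
            i += 2
        elif pattern.startswith("(?P<", i):   # start of a named group
            name_end = pattern.index(">", i)
            name = pattern[i + 4:name_end]
            depth, j = 1, name_end + 1
            while depth:                      # walk to the matching ")"
                if pattern[j] == "\\":
                    j += 2
                    continue
                depth += {"(": 1, ")": -1}.get(pattern[j], 0)
                j += 1
            body = pattern[name_end + 1:j - 1]
            value = values[name]
            if re.fullmatch(body, value) is None:
                raise ValueError("value %r does not match %r" % (value, body))
            out.append(value)
            i = j
        else:
            out.append(ch)
            i += 1
    return "".join(out)

nested = r"CL(?P<name1>[a-z]+)XY(?P<name2>CD(?P<name3>..)\?EF)"
print(fill_top_level(nested, {"name1": "iamgood", "name2": "CD@@?EF"}))
# CLiamgoodXYCD@@?EF
```

With this design only the outermost names need values; name3 is checked implicitly because the value for name2 must match the full body of that group.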


--
Steven
 
MRAB
      02-21-2008
On Feb 20, 7:36 pm, Amit Gupta wrote:
> Before I read the message: I screwed up.
>
> Let me write it again.
>
> >>> x = re.compile("CL(?P<name1>[a-z]+)")
>
> # The group name "name1" is attached to the match of a lowercase
> # alphabetic string.
> # Now I have a dictionary {"name1": "iamgood"}.
> # I would like a function that takes x and my dictionary and returns
> # "CLiamgood".
> # If my dictionary is instead {"name1": "123"}, it should give an error
> # while processing it.
> #
> # In general, I have a regular expression where every non-trivial match
> # has a group name. I want to do the reverse of a regexp match: the
> # function takes the regexp and replaces each group with its value from
> # the dictionary.
> # I hope this makes it clear.


If you want the string that matched the regex then you can use
group(0) (or just group()):

>>> x = re.compile("CL(?P<name1>[a-z]+)")
>>> m = x.search("something CLiamgood!something else")
>>> m.group()

'CLiamgood'
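
A slightly fuller sketch of the same match object (Python 3), showing that the named group and the whole match can be read separately:

```python
import re

x = re.compile("CL(?P<name1>[a-z]+)")
m = x.search("something CLiamgood!something else")

print(m.group())         # CLiamgood  (same as m.group(0))
print(m.group("name1"))  # iamgood
print(m.groupdict())     # {'name1': 'iamgood'}
```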
 
Paul McGuire
      02-21-2008
On Feb 20, 6:29 pm, Steven D'Aprano <st...@REMOVE-THIS-cybersource.com.au> wrote:
> On Wed, 20 Feb 2008 11:36:20 -0800, Amit Gupta wrote:
> > Before I read the message: I screwed up.
> >
> > Let me write it again.
> >
> > >>> x = re.compile("CL(?P<name1>[a-z]+)")
> >
> > # The group name "name1" is attached to the match of a lowercase
> > # alphabetic string.
> > # Now I have a dictionary {"name1": "iamgood"}.
> > # I would like a function that takes x and my dictionary and returns
> > # "CLiamgood".
> > # If my dictionary is instead {"name1": "123"}, it should give an error
> > # while processing it.
> > #
> > # In general, I have a regular expression where every non-trivial match
> > # has a group name. I want to do the reverse of a regexp match: the
> > # function takes the regexp and replaces each group with its value from
> > # the dictionary.
> > # I hope this makes it clear.
>
> <snip>
>
> Good luck.
>
> --
> Steven


Oh, pshaw! Try this pyparsing ditty.

-- Paul
http://pyparsing.wikispaces.com



from pyparsing import *
import re

# replace patterns of (?P<name>xxx) with dict
# values iff value matches 'xxx' as re

LPAR,RPAR,LT,GT = map(Suppress,"()<>")
nameFlag = Suppress("?P")
rechars = printables.replace(")","").replace("(","")+" "
regex = Forward()("fld_re")
namedField = (nameFlag + \
              LT + Word(alphas,alphanums+"_")("fld_name") + GT + \
              regex )
regex << Combine(OneOrMore(Word(rechars) |
                           r"\(" | r"\)" |
                           nestedExpr(LPAR, RPAR, namedField |
                                                  regex,
                                      ignoreExpr=None ) ))

def fillRE(reString, nameDict):
    def fieldPA(tokens):
        # parse action: validate the dict value against the group's regex
        fieldRE = tokens.fld_re
        fieldName = tokens.fld_name
        if fieldName not in nameDict:
            raise ParseFatalException(
                "name '%s' not defined in name dict" %
                (fieldName,) )
        fieldTranslation = nameDict[fieldName]
        if (re.match(fieldRE, fieldTranslation)):
            return fieldTranslation
        else:
            raise ParseFatalException(
                "value '%s' does not match re '%s'" %
                (fieldTranslation, fieldRE) )
    namedField.setParseAction(fieldPA)
    try:
        return (LPAR + namedField + RPAR).transformString(reString)
    except ParseBaseException as pe:
        return pe.msg

# tests start here
testRE = r"CL(?P<name1>[a-z]+)"

# a simple test
test1 = { "name1" : "iamgood" }
print(fillRE(testRE, test1))

# harder test, including nested names (have to be careful in
# constructing the names dict)
testRE = \
    r"CL(?P<name1>[a-z]+)XY(?P<name4>(AB)[aeiou]+)" \
    r"(?P<name2>CD(?P<name3>..)\?EF)"
test3 = { "name1" : "iamgoodZ",
          "name2" : "CD@@?EF",
          "name3" : "@@",
          "name4" : "ABeieio",
          }
print(fillRE(testRE, test3))

# test a non-conforming field
test2 = { "name1" : "123" }
print(fillRE(testRE, test2))


Prints:

CLiamgood
CLiamgoodZXYABeieioCD@@?EF
value '123' does not match re '[a-z]+'
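
One detail worth noting in the output above: "iamgoodZ" is accepted for [a-z]+ because re.match only anchors at the start of the string. If the whole value is meant to conform, re.fullmatch (added in Python 3.4, so not available when this thread was written) is one option; this is a small sketch of the difference, not a change to the code above.

```python
import re

pattern = "[a-z]+"
value = "iamgoodZ"

# match() only requires a match at the start, so the trailing "Z" is ignored.
print(bool(re.match(pattern, value)))      # True

# fullmatch() requires the entire value to match the pattern.
print(bool(re.fullmatch(pattern, value)))  # False
```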

 
Amit Gupta
      02-24-2008

> "CL(?P<name1>[a-z]+)XY(?:AB)[aeiou]+(?P<name2>CD(?P<name3>..)\?EF)"
>
> Good luck.
>
> --
> Steven


This is what I did in the end (in principle). Thanks.

A
 