Wildcards for regexps?

 
 
ssecorp
      08-11-2008
If I have an expression like "bob marley" and I want to match
everything with one letter wrong, how would I do?
so "bob narely" and "vob marley" should match etc.
 
Paul McGuire
      08-11-2008
On Aug 10, 11:10 pm, ssecorp <(E-Mail Removed)> wrote:
> If I have an expression like "bob marley" and I want to match
> everything with one letter wrong, how would I do?
> so "bob narely" and "vob marley" should match etc.


At first, I was going to suggest the brute force solution:

".ob marley|b.b marley|bo. marley|bob.marley|bob .arley|bob m.rley|bob ma.ley|bob mar.ey|bob marl.y|bob marle."

But then I realized that after matching the initial 'b', later
alternative matches wouldn't need to keep retesting for a leading 'b',
so here is a nested re (built recursively) that does not go back to
match previously matched characters:

".ob marley|b(.b marley|o(. marley|b(.marley| (.arley|m(.rley|a(.ley|r(.ey|l(.y|e.))))))))"

Here are some functions to generate these monstrosities:

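base = "bob marley"

# Brute force: one alternative per position, with that character
# replaced by '.'. Characters of s are not escaped, so s must not
# contain regex metacharacters.
def makeOffByOneMatchRE(s):
    return "|".join(s[:i] + '.' + s[i+1:] for i in range(len(s)))

re_string = makeOffByOneMatchRE(base)
print(re_string)

# Nested form: once a leading character has matched, it is not
# retested by later alternatives. Requires len(s) >= 2.
# (Renamed so it does not shadow the brute-force version above.)
def makeNestedOffByOneMatchRE(s, i=0):
    if i == len(s) - 2:
        return '.' + s[-1] + '|' + s[-2] + '.'
    return '.' + s[i+1:] + '|' + s[i] + '(' + makeNestedOffByOneMatchRE(s, i+1) + ')'

re_string = makeNestedOffByOneMatchRE(base)
print(re_string)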
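
For anyone who wants to try the brute-force pattern against real input, here is a minimal sketch in Python 3. The names off_by_one_pattern and matches are made up for this example and are not from the code above. It assumes you want the whole string to match, so the alternatives are grouped and anchored, and it runs the literal parts through re.escape so that metacharacters in the base string are taken literally. Note that '.' also matches the original character, so the unmodified string matches too.

```python
import re

def off_by_one_pattern(s):
    """Compile a regex for strings equal to s except at (at most) one position."""
    alternatives = (
        re.escape(s[:i]) + "." + re.escape(s[i + 1:])
        for i in range(len(s))
    )
    # Group the alternatives so the end anchor applies to every one of them.
    return re.compile("(?:" + "|".join(alternatives) + r")\Z")

pattern = off_by_one_pattern("bob marley")

def matches(candidate):
    # match() anchors at the start; \Z in the pattern anchors at the end.
    return pattern.match(candidate) is not None

print(matches("vob marley"))   # one substituted letter
print(matches("bob narely"))   # differs from the base in three positions
```

One thing this makes visible: "bob narely" from the original question differs from "bob marley" in three positions (n/m, e/l, l/e), so a one-wildcard pattern will not accept it.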


-- Paul
 
Diez B. Roggisch
      08-11-2008
ssecorp schrieb:
> If I have an expression like "bob marley" and I want to match
> everything with one letter wrong, how would I do?
> so "bob narely" and "vob marley" should match etc.


Fuzzy matching is better done using Levenshtein distance [1] or
n-gram matching [2].
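
To make the first suggestion concrete, here is a rough sketch of the textbook dynamic-programming computation of Levenshtein distance in Python 3. The function name levenshtein is just an illustrative choice, not a standard-library call, and this is not necessarily what any particular library does internally.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn a into b."""
    # previous[j] holds the distance between a[:i-1] and b[:j].
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]

print(levenshtein("bob marley", "vob marley"))
print(levenshtein("bob marley", "bob narely"))
```

"One letter wrong" then becomes a threshold test such as levenshtein(candidate, "bob marley") <= 1, which, unlike the regex approach, also accepts a missing or extra letter.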


Diez


[1] http://en.wikipedia.org/wiki/Levenshtein_distance
[2] http://en.wikipedia.org/wiki/Ngram#n...imate_matching
 
Timothy Grant
      08-11-2008
On Sun, Aug 10, 2008 at 9:10 PM, ssecorp <(E-Mail Removed)> wrote:
> If I have an expression like "bob marley" and I want to match
> everything with one letter wrong, how would I do?
> so "bob narely" and "vob marley" should match etc.


At one point I needed something like this so did a straight port of
the double-metaphone code to python.

It's horrible, it's ugly, it's non-pythonic in ways that make me
cringe, it has no unit tests, but it does work.

--
Stand Fast,
tjg. [Timothy Grant]
 
Aahz
      08-12-2008
In article <(E-Mail Removed)>,
ssecorp <(E-Mail Removed)> wrote:
>
>If I have an expression like "bob marley" and I want to match
>everything with one letter wrong, how would I do?
>so "bob narely" and "vob marley" should match etc.


difflib
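
A minimal sketch of what using difflib might look like here, in Python 3. difflib.SequenceMatcher and difflib.get_close_matches are real standard-library calls; the candidate list, the similarity helper and the 0.85 cutoff are assumptions picked for this example and would need tuning for real data.

```python
import difflib

target = "bob marley"
# Hypothetical candidates, made up for illustration.
candidates = ["vob marley", "bob narely", "bob dylan", "peter tosh"]

def similarity(a, b):
    # ratio() returns a float in [0, 1]; 1.0 means the sequences are identical.
    return difflib.SequenceMatcher(None, a, b).ratio()

# Keep only candidates whose similarity to the target is at least the cutoff,
# best matches first.
close = difflib.get_close_matches(target, candidates, n=3, cutoff=0.85)

for name in candidates:
    print(name, round(similarity(target, name), 2))
print(close)
```

The ratio is a similarity score rather than an edit count, so "one letter wrong" has to be approximated by choosing a cutoff.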
--
Aahz ((E-Mail Removed)) <*> http://www.pythoncraft.com/

Adopt A Process -- stop killing all your children!
 