Velocity Reviews - Computer Hardware Reviews

Regex not matching a string

 
 
python.prog29@gmail.com
      01-09-2013
Hi All -


In the following code, I am trying to remove a multi-line comment that contains "This is a test comment". For some reason the regex is not matching. Can anyone explain why?

import os
import sys
import re
import fnmatch

def find_and_remove(haystack, needle):
    pattern = re.compile(r'/\*.*?'+ needle + '.*?\*/', re.DOTALL)
    return re.sub(pattern, "", haystack)

for path,dirs,files in os.walk(sys.argv[1]):
    for fname in files:
        for pat in ['*.cpp','*.c','*.h','*.txt']:
            if fnmatch.fnmatch(fname,pat):
                fullname = os.path.join(path,fname)
                # put all the text into f and read and replace...
                f = open(fullname).read()
                result = find_and_remove(f, r"This is a test comment")
                print result
 
 
Steven D'Aprano
      01-09-2013
On Wed, 09 Jan 2013 02:08:23 -0800, python.prog29 wrote:

> Hi All -
>
>
> In the following code ,am trying to remove a multi line - comment that
> contains "This is a test comment" for some reason the regex is not
> matching.. can anyone provide inputs on why it is so?


It works for me.

Some observations:

Perhaps you should consider using the glob module rather than manually
using fnmatch. That's what glob does.

Also, you never actually write to the files, is that deliberate?
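
For illustration, a minimal sketch of what writing the result back could look like. It keeps the os.walk and fnmatch approach from the original post; the names rewrite_tree, PATTERNS and transform are hypothetical and not part of the posted code, and the sketch overwrites files in place without making a backup.

```python
import fnmatch
import os

# Same filename patterns as in the original post.
PATTERNS = ("*.cpp", "*.c", "*.h", "*.txt")

def rewrite_tree(root, transform):
    """Apply transform(text) to each matching file under root.

    Files are only rewritten when the text actually changed.
    Returns the list of paths that were modified.
    """
    changed = []
    for path, dirs, files in os.walk(root):
        for fname in files:
            if not any(fnmatch.fnmatch(fname, pat) for pat in PATTERNS):
                continue
            fullname = os.path.join(path, fname)
            with open(fullname) as f:
                text = f.read()
            result = transform(text)
            if result != text:
                # Overwrites the file in place; no backup is kept.
                with open(fullname, "w") as f:
                    f.write(result)
                changed.append(fullname)
    return changed
```

Here transform would be any function that takes the file text and returns the cleaned text, for example a wrapper around find_and_remove.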

Finally, perhaps your regex simply doesn't match what you think it
matches. Do you actually have any files containing the needle

"/* ... This is a test comment ... */"

(where the ... are any characters) exactly as shown?

Instead of giving us all the irrelevant code that has nothing to do with
matching a regex, you should come up with a simple piece of example code
that demonstrates your problem. Or, in this case, *fails* to demonstrate
the problem.

import re
haystack = "aaa\naaa /*xxxThis is a test comment \nxxx*/aaa\naaa\n"
needle = "This is a test comment"
pattern = re.compile(r'/\*.*?'+ needle + '.*?\*/', re.DOTALL)
print haystack
print re.sub(pattern, "", haystack)


--
Steven
 
 
Peter Otten
      01-09-2013
python.prog29 wrote:

> In the following code ,am trying to remove a multi line - comment that
> contains "This is a test comment" for some reason the regex is not
> matching.. can anyone provide inputs on why it is so?


> def find_and_remove(haystack, needle):
>     pattern = re.compile(r'/\*.*?'+ needle + '.*?\*/', re.DOTALL)
>     return re.sub(pattern, "", haystack)


If a comment does not contain the needle, "/\*.*?" extends past the end of
that comment and keeps going until it finds the needle in a later one:

>>> re.compile(r"/\*.*?xxx").search("/* xxx */").group()
'/* xxx'
>>> re.compile(r"/\*.*?xxx").search("/* yyy */ /* xxx */").group()
'/* yyy */ /* xxx'
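
The overrun can also be prevented inside the pattern itself. The following is only a sketch of one possible regex-only variant, not something proposed in this thread: a negative lookahead stops the lazy part from stepping across a "*/", and re.escape guards against special characters in the needle. The function name remove_comments_containing is hypothetical.

```python
import re

def remove_comments_containing(haystack, needle):
    """Remove each /* ... */ comment whose body contains needle."""
    # Lazily consume one character at a time, but never start on a "*/",
    # so the match cannot run past the end of the current comment.
    inside = r"(?:(?!\*/).)*?"
    pattern = re.compile(
        r"/\*" + inside + re.escape(needle) + inside + r"\*/",
        re.DOTALL,  # let "." match newlines so multi-line comments work
    )
    return pattern.sub("", haystack)
```

With this pattern, a comment that lacks the needle simply fails to match, and the search resumes at the next "/*". It assumes the needle itself does not contain "*/".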


One solution may be a substitution function:

>>> def sub(match, needle="xxx"):
...     s = match.group()
...     if needle in s:
...         return ""
...     else:
...         return s
...
>>> re.compile(r"/\*.*?\*/").sub(sub, "/* yyy */ /* xxx */")
'/* yyy */ '
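
Applied to the original find_and_remove, the substitution-function idea might look like the sketch below. The helper names COMMENT_RE and drop_if_matching are made up for this example. Because the needle is checked with a plain "in" test, it no longer needs to be a valid regex.

```python
import re

# Matches one complete /* ... */ comment, including multi-line ones.
COMMENT_RE = re.compile(r"/\*.*?\*/", re.DOTALL)

def find_and_remove(haystack, needle):
    """Remove every /* ... */ comment that contains needle."""
    def drop_if_matching(match):
        comment = match.group()
        # Delete the comment only if it contains the needle;
        # otherwise put it back unchanged.
        return "" if needle in comment else comment
    return COMMENT_RE.sub(drop_if_matching, haystack)
```

Note that this treats any "/*" as the start of a comment, so a "/*" inside a string literal in the C source would be misread.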


 