Re: Regular Expression for Finding and Deleting comments

 
 
Jeremy
      01-04-2011
On Tuesday, January 4, 2011 11:26:48 AM UTC-7, MRAB wrote:
> On 04/01/2011 17:11, Jeremy wrote:
> > I am trying to write a regular expression that finds and deletes (replaces with nothing) comments in a string/file. A comment is either a line whose first non-whitespace character is a 'c', or everything from a dollar sign to the end of the line. Replacing these comments with nothing isn't too hard. The trouble is that each comment is replaced with a new-line; that is, the new-line isn't captured by the regular expression.
> >
> > Below, I have copied a minimal example. Can someone help?
> >
> > Thanks,
> > Jeremy
> >
> >
> > import re
> >
> > text = """ c
> > C - Second full line comment (first comment had no text)
> > c Third full line comment
> > F44:N 2 $ Inline comments start with dollar sign and go to end of line"""
> >
> > commentPattern = re.compile("""
> > (^\s*?c\s*?.*?| # Comment start with c or C
> > \$.*?)$\n # Comment starting with $
> > """, re.VERBOSE|re.MULTILINE|re.IGNORECASE)
> >

> Part of the problem is that you're not using raw string literals or
> doubling the backslashes.
>
> Try something like this:
>
> commentPattern = re.compile(r"""
> (^[ \t]*c.*\n| # Comment start with c or C
> [ \t]*\$.*) # Comment starting with $
> """, re.VERBOSE|re.MULTILINE|re.IGNORECASE)
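
For readers following along, here is a minimal sketch of the suggested pattern applied to the sample text with `sub`. The explicit `\n` line endings, the leading space on the first line, and the names `comment_pattern` and `stripped` are assumptions made for illustration; they are not from the original post.

```python
import re

# Sample input reconstructed from the original post. The leading space on
# the first line is an assumption about the original indentation.
text = (
    " c\n"
    "C - Second full line comment (first comment had no text)\n"
    "c Third full line comment\n"
    "F44:N 2 $ Inline comments start with dollar sign and go to end of line"
)

# Raw string literal, so the backslashes reach the regex engine untouched.
comment_pattern = re.compile(r"""
    (^[ \t]*c.*\n   |   # full-line comment: optional indent, then c or C
     [ \t]*\$.*)        # inline comment: from the dollar sign to end of line
    """, re.VERBOSE | re.MULTILINE | re.IGNORECASE)

# Replace every comment with the empty string.
stripped = comment_pattern.sub("", text)
print(repr(stripped))  # → 'F44:N 2'
```

Two caveats with this pattern: the first alternative requires a trailing new-line, so a full-line comment on the very last line of a file with no final new-line is left in place, and any line whose first non-whitespace character is a 'c' is treated as a comment, whatever follows it.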


Using a raw string literal fixed the problem for me. Thanks for the suggestion. Why is that so important?

Jeremy

 
MRAB
      01-04-2011
On 04/01/2011 19:37, Jeremy wrote:
> Using a raw string literal fixed the problem for me. Thanks for the suggestion. Why is that so important?
>

Regexes often use escape sequences, but so do string literals, and a
sequence which is intended for the regex engine might not get passed
along correctly. For example, in a normal string literal \b means
'backspace' and will be passed to the regex engine as that; in a regex
it usually means 'word boundary':

A regex for "the" as a word: \bthe\b

As a raw string literal: r"\bthe\b"

As a normal string literal: "\\bthe\\b"

"\bthe\b" means: backspace + "the" + backspace
 
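
The difference between the three literals can be checked directly. This is a small sketch; the variable names and the test strings are made up for illustration. The last part shows one likely reason the raw string mattered for the original pattern (my reading, not something stated in the thread): in a normal string literal `\n` becomes a real new-line character, and `re.VERBOSE` ignores unescaped whitespace in the pattern, so the new-line never reaches the regex.

```python
import re

normal = "\bthe\b"     # \b is a backspace character (0x08) here
raw = r"\bthe\b"       # backslash and 'b' are kept as two characters
doubled = "\\bthe\\b"  # the same string as the raw literal

print(len(normal), len(raw))                # → 5 7
print(raw == doubled)                       # → True
print(re.findall(raw, "the other the"))     # → ['the', 'the']
print(re.findall(normal, "the other the"))  # → []

# With re.VERBOSE, a literal new-line character in the pattern is ignored
# as whitespace, while the two-character escape \n is still honoured.
verbose_normal = re.compile("a\n", re.VERBOSE)   # effectively just "a"
verbose_raw = re.compile(r"a\n", re.VERBOSE)     # "a" followed by a new-line

print(repr(verbose_normal.sub("", "a\nb")))  # → '\nb'
print(repr(verbose_raw.sub("", "a\nb")))     # → 'b'
```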