F: How can I make re.sub() replace patterns across newlines

 
 
Viktor Rosenfeld
 
      02-02-2004
Hi,

I want to strip a Java file of /* */ style comments. Unfortunately, the
simple regexp "\/\*.*\*\/" only works on comments that are on one line.
Is there a simple way to remove comments that span several lines with
Python regexps? I tried re.M to no avail.

Thanks,
Viktor
 
 
 
 
 
Karl Pflästerer
 
      02-02-2004
Viktor Rosenfeld wrote:

> I want to strip a Java file of /* */ style comments. Unfortunately, the
> simple regexp "\/\*.*\*\/" only works on comments that are on one line.
> Is there a simple way to remove comments that span several lines with
> Python regexps? I tried re.M to no avail.


You must use re.S

,----[ Python lib reference ]
| `S'
|
| `DOTALL'
| Make the `.' special character match any character at all,
| including a newline; without this flag, `.' will match anything
| _except_ a newline.
`----
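
The difference between the two flags can be sketched as follows. This is only an illustration: the `sample` string and the variable names are made up for the example, not taken from the original post. re.M changes where `^` and `$` match, while re.S changes what `.` matches.

```python
import re

# Hypothetical sample: one comment spanning two lines, followed by code.
sample = "/* first line\nsecond line */\nint x;\n"
pattern = r"/\*.*?\*/"

# re.M (MULTILINE) only affects ^ and $, so '.' still stops at the newline
# and the two-line comment is never matched.
with_m = re.sub(pattern, "", sample, flags=re.M)

# re.S (DOTALL) lets '.' match the newline, so the whole comment is removed.
with_s = re.sub(pattern, "", sample, flags=re.S)

print(repr(with_m))
print(repr(with_s))
```

With re.M the string comes back unchanged; with re.S only the code after the comment is left.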


KP

--
Men of science! Science is said to have had many,
but most of them wrongly.
Karl Kraus, 'Aphorismen'
 
 
 
 
 
Josiah Carlson
 
      02-02-2004
Viktor Rosenfeld wrote:

> Hi,
>
> I want to strip a Java file of /* */ style comments. Unfortunately, the
> simple regexp "\/\*.*\*\/" only works on comments that are on one line.
> Is there a simple way to remove comments that span several lines with
> Python regexps? I tried re.M to no avail.
>
> Thanks,
> Viktor


Viktor,

Supply the DOTALL flag during the regular expression compile as
described here: http://www.python.org/doc/current/lib/re-syntax.html

You will also want to make the regular expression non-greedy: with DOTALL,
a greedy .* runs from the first /* to the last */ in the input and swallows
everything in between.

>>> import re
>>> import pprint
>>>
>>> st = """
... /* this is a
... multi-line comment */
...
... /* this is a single-line comment */
...
... /* this /* has multiple
... starts */
... """
>>> # non-greedy matching
>>> NonGreedy = re.compile(r"/\*.*?\*/", re.DOTALL)
>>> pprint.pprint(NonGreedy.findall(st))
['/* this is a\nmulti-line comment */',
 '/* this is a single-line comment */',
 '/* this /* has multiple\nstarts */']
>>> # greedy matching: one match from the first /* to the last */
>>> Greedy = re.compile(r"/\*.*\*/", re.DOTALL)
>>> Greedy.findall(st)
['/* this is a\nmulti-line comment */\n\n/* this is a single-line comment */\n\n/* this /* has multiple\nstarts */']
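
To see why the greedy form is a problem when substituting rather than just searching, here is a small sketch. The `java_src` snippet and the variable names are invented for illustration; the point is only that real code sitting between two comments disappears with the greedy pattern.

```python
import re

# Hypothetical Java snippet with code sitting between two block comments.
java_src = "/* header */\nint a = 1;\n/* trailer */\nint b = 2;\n"

greedy = re.compile(r"/\*.*\*/", re.DOTALL)
non_greedy = re.compile(r"/\*.*?\*/", re.DOTALL)

# Greedy: one match from the first /* to the last */, so 'int a = 1;'
# is deleted along with both comments.
greedy_result = greedy.sub("", java_src)

# Non-greedy: each comment is matched on its own, so the code survives.
non_greedy_result = non_greedy.sub("", java_src)

print(repr(greedy_result))
print(repr(non_greedy_result))
```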

- Josiah
 
 
Hans Nowak
 
      02-02-2004
Viktor Rosenfeld wrote:
> Hi,
>
> I want to strip a Java file of /* */ style comments. Unfortunately, the
> simple regexp "\/\*.*\*\/" only works on comments that are on one line.
> Is there a simple way to remove comments that span several lines with
> Python regexps? I tried re.M to no avail.


Something like:

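import re
# DOTALL lets '.' cross newlines; MULTILINE only affects ^ and $,
# so it is harmless here but not required.
pattern = re.compile(r"/\*.*?\*/", re.MULTILINE|re.DOTALL)
# data is assumed to hold the contents of the Java file as one string.
stripped_data = pattern.sub("", data)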

Note that I added a ? to the regex, so it won't be "greedy".
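
A self-contained version of the same idea might look like the sketch below. The function name `strip_block_comments` and the sample `data` string are assumptions made for the example; in practice `data` would be read from the .java file. The sample also shows a limitation of any purely regex-based approach: a /* ... */ sequence inside a string literal is removed as well, because the pattern knows nothing about Java string syntax.

```python
import re

# Compiled once; DOTALL makes '.' match newlines, '?' keeps each match short.
BLOCK_COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)


def strip_block_comments(source):
    """Return source with every /* ... */ span removed (hypothetical helper)."""
    return BLOCK_COMMENT.sub("", source)


# Hypothetical input: a two-line comment, plus a string literal that
# merely looks like a comment.
data = 'int a; /* one\ntwo */ int b;\nString s = "/* not a comment */";\n'
stripped_data = strip_block_comments(data)
print(stripped_data)
```

The real comment is removed as intended, but the string literal ends up empty, so this is fine for quick clean-ups and not a substitute for a proper Java tokenizer.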

HTH,

--
Hans
http://zephyrfalcon.org/



 
 
Viktor Rosenfeld
 
      02-02-2004
Thanks to all who were quick to answer; using re.DOTALL indeed solves the
problem. I was too tired to read the documentation properly.

Ciao,
Viktor

 