Viktor Rosenfeld wrote:
> Hi,
>
> I want to strip a JAVA file of /* */ like comments. Unfortunately, the
> simple regexp "\/\*.*\*\/" only works on comments, that are on one line.
> Is there a simple way to remove comments that go across several lines with
> python regexp's? I tried re.M to no avail.
>
> Thanks,
> Viktor
Viktor,
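Pass the re.DOTALL flag when you compile the regular expression, as
described here:
http://www.python.org/doc/current/lib/re-syntax.html
re.M does not help because it only changes where ^ and $ match; DOTALL
is the flag that lets "." match a newline.
You will also want to make the regular expression non-greedy (.*?
instead of .*). With DOTALL, a greedy .* runs from the first /* in the
file to the last */ and swallows all the code in between.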
>>> import re
>>> import pprint
>>>
>>> st = """
... /* this is a
... multi-line comment */
...
... /* this is a single-line comment */
...
... /* this /* has multiple
... starts */
... """
# non-greedy matching (raw string, so the backslashes reach re untouched)
>>> NonGreedy = re.compile(r"/\*.*?\*/", re.DOTALL)
>>>
>>> pprint.pprint(NonGreedy.findall(st))
['/* this is a\nmulti-line comment */',
 '/* this is a single-line comment */',
 '/* this /* has multiple\nstarts */']
# greedy matching
>>> Greedy = re.compile(r"/\*.*\*/", re.DOTALL)
>>> pprint.pprint(Greedy.findall(st))
['/* this is a\nmulti-line comment */\n\n/* this is a single-line comment */\n\n/* this /* has multiple\nstarts */']
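To actually strip the comments rather than just list them, the same
non-greedy pattern can be handed to sub(). This is only a minimal
sketch: the name strip_block_comments is made up for illustration, and
a regexp this simple knows nothing about Java string literals, so it
will also remove a /* ... */ that happens to sit inside a quoted string.

```python
import re

# Non-greedy body, with DOTALL so that "." also matches newlines.
BLOCK_COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)


def strip_block_comments(source):
    """Return source with every /* ... */ comment removed.

    Hypothetical helper, not part of the re module. It does not
    recognise string literals or // comments.
    """
    return BLOCK_COMMENT.sub("", source)


java = """int a = 1; /* first
comment */ int b = 2; /* second */
"""
print(strip_block_comments(java))
```

Only the comment text is removed; the whitespace around each comment is
left where it was.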
- Josiah