Regular Expressions Problem

 
 
Oriana
      09-09-2004
Hi!

I'm trying to 'clean up' this source file using regular expressions
in Python. My problem is, that when I try to delete extra lines, my
code fails. Here's an example....

/**
*
* Project: MyProject
*
*
*
*
*
*
*
* Description:
*
* This file contains the some code.
*
* Public Functions:
*
* function_1
* function_2
*
* Private Functions:
*
* None.
*
*
* Notes:
*
* None.
*
*
*
***************************************************************************/


......I would like my code to only have one * space between lines, and
not all that white space that I see there. I tried to use the regular
expression: '^\*\n$^\*\n$' but that does not work. I've tried a bunch
of things and none of them seem to work....please help!!! Thanks in
advance, Oriana
 
 
 
 
 
Kirk Job-Sluder
      09-09-2004
On 2004-09-09, Oriana <(E-Mail Removed)> wrote:
> .....I would like my code to only have one * space between lines, and
> not all that white space that I see there. I tried to use the regular
> expression: '^\*\n$^\*\n$' but that does not work. I've tried a bunch
> of things and none of them seem to work....please help!!! Thanks in
> advance, Oriana


Hrm, some suggestions.

First, you need to set the MULTILINE mode on the regular expression
object. You can do this with re.compile(pattern, re.MULTILINE).

Second, in MULTILINE mode the "$" character matches just before each
newline, so the "$" has to come before the "\n": '^\*$\n'.

Third, the regex you have here would at best reduce each pair of blank
comment lines to one, so a longer run would not be fully collapsed in a
single pass.
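To make those three points concrete, here is a minimal sketch. The sample string and the variable names are made up for illustration; only the two patterns come from the discussion above.

```python
import re

# Illustrative sample: four bare "*" lines followed by a real comment line.
sample = "*\n*\n*\n*\n* Notes:\n"

original = r'^\*\n$^\*\n$'     # pattern from the question
fixed_pair = r'^\*$\n^\*$\n'   # "$" moved before "\n"; still matches pairs only

# The original pattern finds nothing even with MULTILINE, because "$"
# is asked to match right after a newline, where the next line begins.
print(re.search(original, sample, re.MULTILINE))

# Without MULTILINE, "^" matches only at the very start of the string,
# so a run that begins on a later line is never found.
no_flag = re.search(fixed_pair, "/**\n" + sample)
print(no_flag)

# With MULTILINE the pair pattern matches, but each pair becomes one line,
# so four blank comment lines shrink to two rather than one.
collapsed = re.sub(fixed_pair, "*\n", sample, flags=re.MULTILINE)
print(repr(collapsed))
```

The last result still contains two bare "*" lines, which is why a repetition count is needed.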

Try this:
>>> import re
>>> string = """*
... *
... *
... *
... *
... *
... *
... """
>>> foo = re.compile(r'(^\*\s*\n){2,}', re.MULTILINE)
>>> foo.sub("*\n", string)
'*\n'

The blank comment line is described by (^\*\s*\n) (asterisk at the start
of a line, followed by 0 or more whitespace characters, then a newline).
The {2,} says "match two or more of this group."
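One caveat that may be worth checking: \s also matches newline characters, so the pattern can absorb a truly empty line that follows a run of bare "*" lines. The sketch below compares the pattern above with a variant that uses [ \t]* instead. The variant, the names BLANK_RUN and BLANK_RUN_SAME_LINE, and the sample strings are my own assumptions, not part of the suggestion above.

```python
import re

# The pattern from the post, and an assumed variant that only allows
# spaces and tabs after the asterisk.
BLANK_RUN = re.compile(r'(^\*\s*\n){2,}', re.MULTILINE)
BLANK_RUN_SAME_LINE = re.compile(r'(^\*[ \t]*\n){2,}', re.MULTILINE)

# Illustrative header: one run of three blank comment lines, one with
# trailing spaces.
header = "/**\n*\n* Project: MyProject\n*\n*  \n*\n* Description:\n*\n*/\n"

# Both patterns give the same result on an ordinary comment header.
print(BLANK_RUN.sub("*\n", header))

# They differ when an empty line follows the run of "*" lines.
tricky = "*\n*\n\ncode\n"
print(repr(BLANK_RUN.sub("*\n", tricky)))            # empty line is removed
print(repr(BLANK_RUN_SAME_LINE.sub("*\n", tricky)))  # empty line is kept
```

Whether that difference matters depends on the files being cleaned; for a comment block like the one in the question, either pattern should behave the same.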

Also, I can't really overrecommend "Mastering Regular Expressions" as a
good book for regular expression users:
http://www.oreilly.com/catalog/regex/

There is also a nice python-centric regex page at:
http://www.amk.ca/python/howto/regex/


--
Kirk Job-Sluder
"The square-jawed homunculi of Tommy Hilfinger ads make every day an
existential holocaust." --Scary Go Round
 
 
 
 
 
Andrew Dalke
      09-09-2004
Oriana wrote:
> Hi!
>
> I'm trying to 'clean up' this source file using regular expressions
> in Python. My problem is, that when I try to delete extra lines, my
> code fails. Here's an example....


You probably need the re.MULTILINE flag. This worked for me:

>>> import re
>>> pat = re.compile(r"^\*\s*\n(^\*\s*\n)+", re.MULTILINE)
>>> text = """/**
... *
... * Project: MyProject
... *
... *
... *
... *
... *
... *
... *
... * Description:
... *
... * This file contains the some code.
... *
... * Public Functions:
... *
... * function_1
... * function_2
... *
... * Private Functions:
... *
... * None.
... *
... *
... * Notes:
... """
>>> print(pat.sub("*\n", text))
/**
*
* Project: MyProject
*
* Description:
*
* This file contains the some code.
*
* Public Functions:
*
* function_1
* function_2
*
* Private Functions:
*
* None.
*
* Notes:

>>>
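Since the goal is to clean up a source file, the same substitution can be wrapped in a small helper that works on whole files. This is only a sketch: the function name clean_file and the name BLANK_RUN are hypothetical, and it assumes the file is small enough to read into memory in one go.

```python
import re
from pathlib import Path

# Same pattern as in the session above: one bare "*" line followed by
# one or more further bare "*" lines.
BLANK_RUN = re.compile(r"^\*\s*\n(^\*\s*\n)+", re.MULTILINE)


def clean_file(src, dst):
    """Read src, collapse each run of bare '*' lines to one, write to dst."""
    text = Path(src).read_text()
    Path(dst).write_text(BLANK_RUN.sub("*\n", text))
```

A call would look like clean_file("input.c", "output.c"), where both file names are placeholders. Writing to a separate output file keeps the original intact in case the pattern removes more than intended.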


Andrew
 
 
Brian Szmyd
      09-10-2004
Oriana wrote:

> Hi!
>
> I'm trying to 'clean up' this source file using regular expressions
> in Python. My problem is, that when I try to delete extra lines, my
> code fails. Here's an example....
>
> /**
> *
> * Project: MyProject
> *
> *
> *
> *
> *
> *
> *
> * Description:
> *
> * This file contains the some code.
> *
> * Public Functions:
> *
> * function_1
> * function_2
> *
> * Private Functions:
> *
> * None.
> *
> *
> * Notes:
> *
> * None.
> *
> *
> *
> ***************************************************************************/
>
>
> .....I would like my code to only have one * space between lines, and
> not all that white space that I see there. I tried to use the regular
> expression: '^\*\n$^\*\n$' but that does not work. I've tried a bunch
> of things and none of them seem to work....please help!!! Thanks in
> advance, Oriana


Are you reading in the file line by line? If so, why not just have a flag
that states you've seen an empty line, and then if the flag is true, do not
output any more empty lines until you see a non-empty line?

Pseudo-Code:

fp = openfile($filename)
op = openfile($newfile)

emptyline = 0
while(not fp.eof())
    line = fp.readline()

    if (isEmpty(line))
        if(emptyline) continue
        emptyline = 1
    else
        emptyline = 0

    op.writeline(line)

Of course you'll have to decide how you want isEmpty() to decide if a string
is an empty line, but this should be pretty painless.
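In Python, the flag approach might look roughly like the sketch below. The names is_empty and squeeze_empty_lines are hypothetical, and the definition of "empty" (a line holding only "*" and whitespace) is an assumption based on the comment block in the question.

```python
def is_empty(line):
    # Assumption: an "empty" line is a comment line containing only "*"
    # plus optional surrounding whitespace.
    return line.strip() == "*"


def squeeze_empty_lines(lines):
    """Yield lines, keeping only the first line of each run of empty ones."""
    seen_empty = False
    for line in lines:
        if is_empty(line):
            if seen_empty:
                continue  # already emitted one empty line for this run
            seen_empty = True
        else:
            seen_empty = False
        yield line


# Illustrative input; a file object opened for reading would work the same way.
sample_lines = ["/**\n", "*\n", "*\n", "*\n", "* Notes:\n", "*\n", "*/\n"]
print("".join(squeeze_empty_lines(sample_lines)), end="")
```

Because the generator accepts any iterable of lines, it can be fed an open file and its output passed to writelines() on the destination file, so the whole file never has to sit in memory.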

-regards
brian szmyd
 
 
 
 