Method needed for skipping lines

 
 
Gustaf
      10-31-2007
Hi all,

Just for fun, I'm working on a script to count the number of lines in source files. Some lines are auto-generated (by the IDE) and shouldn't be counted. The auto-generated part of a file starts with "Begin VB.Form" and ends with "End" (first thing on the line). The "End" keyword may also appear inside the auto-generated part, but not at the beginning of the line.

I imagine having a flag variable to tell whether you're inside the auto-generated part, but I wasn't able to figure out exactly how. Here's the function, without the ability to skip auto-generated code:

# Count the lines of source code in the file
def count_lines(f):
    file = open(f, 'r')
    rows = 0
    for line in file:
        rows = rows + 1
    return rows

How would you modify this to exclude lines between "Begin VB.Form" and "End" as described above?

Gustaf
 
 
Marc 'BlackJack' Rintsch
      10-31-2007
On Wed, 31 Oct 2007 18:02:26 +0100, Gustaf wrote:

> How would you modify this to exclude lines between "Begin VB.Form" and
> "End" as described above?


Introduce the flag and look up the docs for the `startswith()` method on
strings.
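
A minimal sketch of that hint, assuming the lines are already available as an iterable of strings. The name `count_source_lines` and the sample data are made up for illustration. Each line is tested unstripped, so an indented "End" inside the generated part does not close it.

```python
def count_source_lines(lines):
    """Count the lines outside a "Begin VB.Form" ... "End" block."""
    count = 0
    in_generated = False  # the flag: True while inside the generated part
    for line in lines:
        if in_generated:
            # Only an "End" at the very start of the line closes the block.
            if line.startswith("End"):
                in_generated = False
        elif line.startswith("Begin VB.Form"):
            in_generated = True
        else:
            count += 1
    return count


# Hypothetical sample resembling a VB form file.
sample = [
    "Begin VB.Form Form1\n",
    "   Begin VB.CommandButton Command1\n",
    "   End\n",
    "End\n",
    "Private Sub Command1_Click()\n",
    "End Sub\n",
]
print(count_source_lines(sample))  # prints 2
```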

Ciao,
Marc 'BlackJack' Rintsch
 
 
Yu-Xi Lim
      10-31-2007
Gustaf wrote:
> How would you modify this to exclude lines between "Begin VB.Form" and
> "End" as described above?
> Gustaf


David Mertz's Text Processing in Python might give you some more
efficient (and interesting) ways of approaching the problem.

http://gnosis.cx/TPiP/
 
 
Bruno Desthuilliers
      10-31-2007
Gustaf wrote:
> # Count the lines of source code in the file
> def count_lines(f):
>     file = open(f, 'r')

1/ The param name is not very explicit.
2/ You're shadowing the builtin file type.
3/ It might be better to pass an opened file object instead - this would
make your function more generic (ok, perhaps a bit overkill here, but
still a better practice IMHO).

>     rows = 0

Shouldn't that be something like 'line_count' ?

>     for line in file:
>         rows = rows + 1

Use augmented assignment instead:
    rows += 1

>     return rows

You forgot to close the file.
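
Put together, the remarks above might look like the following sketch. It assumes a Python version with the `with` statement (2.6 or later, or 2.5 with `from __future__ import with_statement`). The names `count_lines_in` and `count_lines_at` are made up for illustration.

```python
def count_lines_in(lines):
    """Count the items of any iterable of lines, e.g. an open file."""
    line_count = 0
    for _ in lines:
        line_count += 1
    return line_count


def count_lines_at(path):
    """Open the file at `path`, count its lines, and always close it."""
    with open(path) as source_file:
        return count_lines_in(source_file)
```

Because `count_lines_in` only iterates, it also accepts a plain list of strings, which makes it easy to try out without touching the disk.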

> How would you modify this to exclude lines between "Begin VB.Form" and
> "End" as described above?


Here's a straightforward solution:

def count_loc(path):
    loc_count = 0
    in_form = False
    opened_file = open(path)
    try:
        for raw_line in opened_file:
            # stripping lines
            line = raw_line.strip()
            # skipping blank lines
            if not line:
                continue
            # skipping VB comments (they start with an apostrophe)
            # XXX: comment mark should not be hardcoded
            if line.startswith("'"):
                continue
            # skipping autogenerated code; the unstripped line is tested,
            # because an indented "End" inside the form must not end it
            if raw_line.startswith("Begin VB.Form"):
                in_form = True
                continue
            elif in_form:
                if raw_line.startswith("End"):
                    in_form = False
                continue
            # Still here ? ok, we count this one
            loc_count += 1
    finally:
        opened_file.close()
    return loc_count

HTH

PS : If you prefer a more functional approach
(warning: the following code may permanently damage innocent minds):

def chain(*predicates):
    def _chained(arg):
        for p in predicates:
            if not p(arg):
                return False
        return True
    return _chained

def not_(predicate):
    def _not_(arg):
        return not predicate(arg)
    return _not_

class InGroupPredicate(object):
    def __init__(self, begin_group, end_group):
        self.in_group = False
        self.begin_group = begin_group
        self.end_group = end_group

    def __call__(self, line):
        if self.begin_group(line):
            self.in_group = True
            return True
        elif self.in_group and self.end_group(line):
            self.in_group = False
            return True  # this one too is part of the group
        return self.in_group

def count_locs(lines, count_line):
    # filter() is lazy on Python 3, so count by iterating instead of len().
    # Lines are passed unstripped so that an indented "End" inside the
    # form is not mistaken for the closing one.
    return sum(1 for _ in filter(
        chain(lambda line: bool(line.strip()), count_line),
        lines
    ))

def count_vb_locs(lines):
    return count_locs(lines, chain(
        not_(InGroupPredicate(
            lambda line: line.startswith('Begin VB.Form'),
            lambda line: line.startswith('End')
        )),
        # VB comments start with an apostrophe
        lambda line: not line.lstrip().startswith("'")
    ))

# and finally our count_lines function, greatly simplified !-)
def count_lines(path):
    f = open(path)
    try:
        return count_vb_locs(f)
    finally:
        f.close()

(anyone up for doing it with itertools ?-)
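
Taking up that invitation, here is one possible itertools sketch, not a definitive answer. It assumes at most one "Begin VB.Form" block per file and only skips blank lines (comment filtering is left out for brevity). The names `lines_outside_form` and `count_outside_form` are made up for illustration.

```python
import itertools


def lines_outside_form(lines):
    """Lazily yield the lines before and after the generated block."""
    lines = iter(lines)  # one shared iterator, so both stages advance it
    # Everything up to the block; takewhile() also consumes the
    # "Begin VB.Form" line that makes the predicate fail.
    before = itertools.takewhile(
        lambda line: not line.startswith("Begin VB.Form"), lines)
    # Drop the block body up to the unindented "End", then skip that
    # "End" line itself with islice().
    after = itertools.islice(
        itertools.dropwhile(lambda line: not line.startswith("End"), lines),
        1, None)
    return itertools.chain(before, after)


def count_outside_form(lines):
    """Count the non-blank lines outside the generated block."""
    return sum(1 for line in lines_outside_form(lines) if line.strip())


# Hypothetical sample resembling a VB form file.
sample = [
    "Begin VB.Form Form1\n",
    "   Caption = \"Demo\"\n",
    "   Begin VB.CommandButton Command1\n",
    "   End\n",
    "End\n",
    "\n",
    "Private Sub Command1_Click()\n",
    "End Sub\n",
]
print(count_outside_form(sample))  # prints 2
```

This relies on `chain()` exhausting `before` completely before it touches `after`, so the two stages read the shared iterator in order.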
 
 
Paul Hankin
      11-01-2007
On Oct 31, 5:02 pm, Gustaf wrote:
> How would you modify this to exclude lines between "Begin VB.Form" and "End" as described above?


First, your function can be written much more compactly:
def count_lines(f):
    # a file object has no len(), so count its lines by iterating
    return sum(1 for line in open(f, 'r'))


Anyway, to answer your question, write a function that omits the lines
you want excluded:

def omit_generated_lines(lines):
    in_generated = False
    for line in lines:
        # test the unstripped line: "End" only closes the generated part
        # when it is the first thing on the line
        in_generated = in_generated or line.startswith('Begin VB.Form')
        if not in_generated:
            yield line.strip()
        in_generated = in_generated and not line.startswith('End')

And count the remaining ones...

def count_lines(filename):
    # a generator has no len() either, so count by iterating
    return sum(1 for line in omit_generated_lines(open(filename, 'r')))

--
Paul Hankin

 
 
Anand
      11-01-2007
On Nov 1, 5:04 am, Paul Hankin wrote:
> On Oct 31, 5:02 pm, Gustaf wrote:
>
> > Just for fun, I'm working on a script to count the number of lines in source files. Some lines are auto-generated (by the IDE) and shouldn't be counted. The auto-generated part of a file starts with "Begin VB.Form" and ends with "End" (first thing on the line). The "End" keyword may also appear inside the auto-generated part, but not at the beginning of the line.


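I think regular expressions can help here.

import re

# Non-greedy ".*?" stops at the first line that is exactly "End";
# the dot in "VB.Form" is escaped so it only matches a literal dot.
rx = re.compile(r'^Begin VB\.Form.*?^End\n', re.DOTALL | re.MULTILINE)

def count(filename):
    text = open(filename).read()
    return rx.sub('', text).count('\n')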
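
To illustrate what such a pattern removes, here is a small sketch on an in-memory string. The sample text and the names `FORM_BLOCK` and `count_text_lines` are made up for illustration. It uses a non-greedy `.*?` so the match stops at the first line consisting only of `End`; a greedy `.*` would run on to the last such line in the text.

```python
import re

# MULTILINE lets "^" match at every line start; DOTALL lets "." span newlines.
FORM_BLOCK = re.compile(r"^Begin VB\.Form.*?^End\n", re.DOTALL | re.MULTILINE)


def count_text_lines(text):
    """Count newline-terminated lines left after removing the form block."""
    return FORM_BLOCK.sub("", text).count("\n")


# Hypothetical sample resembling a VB form file.
sample = (
    "Begin VB.Form Form1\n"
    "   Begin VB.CommandButton Command1\n"
    "   End\n"
    "End\n"
    "Private Sub Command1_Click()\n"
    "End Sub\n"
)
print(count_text_lines(sample))  # prints 2
```

Note that counting `'\n'` misses a final line that lacks a trailing newline, and that blank lines and comments are still counted.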

 
 
Gustaf
      11-01-2007
Yu-Xi Lim wrote:

> David Mertz's Text Processing in Python might give you some more
> efficient (and interesting) ways of approaching the problem.
>
> http://gnosis.cx/TPiP/


Thank you for the link. Looks like a great resource.

Gustaf
 
 
Gustaf
      11-01-2007
Bruno Desthuilliers wrote:

> Here's a straightforward solution:


<snip/>

Thank you. I learned several things from that.

Gustaf
 