Is there a better/simpler way to filter blank lines?

 
 
Marc 'BlackJack' Rintsch
 
      11-05-2008
On Wed, 05 Nov 2008 12:06:42 +1100, Ben Finney wrote:

> Falcolas <(E-Mail Removed)> writes:
>
>> Using the surrounding parentheses creates a generator object
>
> No. Using the generator expression syntax creates a generator object.
>
> Parentheses are irrelevant to whether the expression is a generator
> expression. The parentheses merely group the expression from surrounding
> syntax.


No, they are important:

In [270]: a = x for x in xrange(10)
------------------------------------------------------------
File "<ipython console>", line 1
a = x for x in xrange(10)
^
<type 'exceptions.SyntaxError'>: invalid syntax


In [271]: a = (x for x in xrange(10))
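The same check can be reproduced outside IPython. The following is only a sketch for current Python 3, where `range` replaces the `xrange` used in the session above; the names `bare_form_parses` and `is_generator` are made up for the illustration.

```python
import types

# A bare generator expression on the right-hand side does not parse.
try:
    compile("a = x for x in range(10)", "<example>", "exec")
    bare_form_parses = True
except SyntaxError:
    bare_form_parses = False

# With parentheses the same expression yields a generator object.
a = (x for x in range(10))
is_generator = isinstance(a, types.GeneratorType)

print(bare_form_parses)  # False
print(is_generator)      # True
```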

Ciao,
Marc 'BlackJack' Rintsch
 
 
 
 
 
Steven D'Aprano
 
      11-05-2008
On Tue, 04 Nov 2008 20:25:09 -0500, Steve Holden wrote:

> I think there'd be no advantage to a sort method on a generator, since
> theoretically the last item could be the first required in the sorted
> sequence, so it's necessary to hold all items in memory to ensure the
> sort is correct. So there's no point using a generator in the first
> place.



You can't sort something lazily.

Actually, that's not *quite* true: it only holds for comparison sorts.
You can sort lazily using non-comparison sorts, such as Counting Sort:

http://en.wikipedia.org/wiki/Counting_sort
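As a rough illustration of the counting-sort idea, here is a minimal sketch. It assumes the items are non-negative integers no larger than a known `max_key`; the helper name `counting_sort` and that parameter are hypothetical. Note that even this version has to read the whole input before it can yield the first item. What it avoids is holding the items themselves: only a table of counts is kept, and the output is produced one value at a time.

```python
def counting_sort(items, max_key):
    """Yield the ints in `items` in ascending order.

    Hypothetical helper: assumes every item is an int in range(max_key + 1).
    """
    counts = [0] * (max_key + 1)
    for item in items:              # the whole input is consumed here
        counts[item] += 1
    for key, count in enumerate(counts):
        for _ in range(count):
            yield key               # output is produced one item at a time


# The input is itself a generator; no list of the items is ever built.
result = list(counting_sort((x % 5 for x in range(12)), max_key=4))
print(result)
```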

Arguably, the benefit of giving generators a sort() method would be to
avoid an explicit call to list. But I think many people would argue that
was actually a disadvantage, not a benefit, and that the call to list is
a good thing. I'd agree with them.

However, sorted() should take a generator argument, and in fact I see it
does:

>>> sorted( x+1 for x in (4, 2, 0, 3, 1) )
[1, 2, 3, 4, 5]
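A slightly longer sketch of the same point, with illustrative variable names: the generator has no sort() method of its own, sorted() builds the list for it, and the generator is exhausted afterwards.

```python
gen = (x + 1 for x in (4, 2, 0, 3, 1))

has_sort_method = hasattr(gen, "sort")  # generators have no sort() method
result = sorted(gen)                    # sorted() consumes the generator
leftover = list(gen)                    # nothing is left once it is consumed

print(has_sort_method, result, leftover)
```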



--
Steven
 
 
 
 
 
Marc 'BlackJack' Rintsch
 
      11-05-2008
On Wed, 05 Nov 2008 13:18:27 +1100, Ben Finney wrote:

> Marc 'BlackJack' Rintsch <(E-Mail Removed)> writes:
>
> Your example shows only that they're important for grouping the
> expression from surrounding syntax. As I said.
>
> They are *not* important for making the expression be a generator
> expression in the first place. Parentheses are irrelevant for the
> generator expression syntax.


Okay, technically correct, but parentheses belong to generator expressions
because they have to be there to separate them from surrounding syntax,
except when there are already enclosing parentheses. So parentheses are
tied to generator expression syntax.

Ciao,
Marc 'BlackJack' Rintsch
 
 
Marc 'BlackJack' Rintsch
 
      11-05-2008
On Wed, 05 Nov 2008 14:39:36 +1100, Ben Finney wrote:

> Marc 'BlackJack' Rintsch <(E-Mail Removed)> writes:
>
>> On Wed, 05 Nov 2008 13:18:27 +1100, Ben Finney wrote:
>>
>> > Marc 'BlackJack' Rintsch <(E-Mail Removed)> writes:
>> >
>> > Your example shows only that they're important for grouping the
>> > expression from surrounding syntax. As I said.
>> >
>> > They are *not* important for making the expression be a generator
>> > expression in the first place. Parentheses are irrelevant for the
>> > generator expression syntax.
>>
>> Okay, technically correct, but parentheses belong to generator
>> expressions because they have to be there to separate them from
>> surrounding syntax, except when there are already enclosing
>> parentheses. So parentheses are tied to generator expression syntax.
>
> No, I think that's factually wrong *and* confusing.
>
> >>> list(i + 7 for i in range(10))
> [7, 8, 9, 10, 11, 12, 13, 14, 15, 16]
>
> Does this demonstrate that parentheses are “tied to” integer literal
> syntax? No.


You can use integer literals without parentheses, like the 7 above, but
you can't use generator expressions without them. They are always
there. In that way parentheses are tied to generator expressions.

If I see the pattern ``f(x) for x in obj if c(x)`` I check whether it is
enclosed in parentheses or brackets to decide if it is a generator
expression or a list comprehension. That may not reflect the formal
grammar, but it is IMHO the easiest and most pragmatic way to look at
this as a human programmer.
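A minimal sketch of that rule of thumb. Here `f`, `c` and `obj` are placeholders invented for the illustration, standing in for the names in the pattern above.

```python
import types


def f(x):           # placeholder transformation
    return x * x


def c(x):           # placeholder condition
    return x % 2 == 0


obj = range(6)

as_list = [f(x) for x in obj if c(x)]   # brackets: list comprehension
as_gen = (f(x) for x in obj if c(x))    # parentheses: generator expression

print(type(as_list).__name__)   # list
print(type(as_gen).__name__)    # generator
```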

Ciao,
Marc 'BlackJack' Rintsch
 
 
Jorgen Grahn
 
      11-05-2008
On Tue, 04 Nov 2008 15:36:23 -0600, Larry Bates <(E-Mail Removed)> wrote:
> (E-Mail Removed) wrote:
>> tmallen:
>>> I'm parsing some text files, and I want to strip blank lines in the
>>> process. Is there a simpler way to do this than what I have here?
>>> lines = filter(lambda line: len(line.strip()) > 0, lines)

...

> Or if you want to filter/loop at the same time, or if you don't want all the
> lines in memory at the same time:


Or if you want to support potentially infinite input streams, such as
a pipe or socket. There are many reasons this is my preferred way of
going through a text file.

> fp = open(filename, 'r')
> for line in fp:
>     if not line.strip():
>         continue
>
>     #
>     # Do something with the non-blank line:
>     #
>
> fp.close()


Often, you want to at least rstrip() all lines anyway,
for other reasons, and then the extra cost is even less:

    line = line.rstrip()
    if not line: continue
    # do something with the rstripped, nonblank lines
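Put together as something runnable, the loop might look like the sketch below. The generator function `nonblank_lines` is a hypothetical name, and `io.StringIO` merely stands in for a real file, pipe or socket file object. Because lines are yielded one at a time, the whole input is never held in memory.

```python
import io


def nonblank_lines(stream):
    """Yield each line of `stream` right-stripped, skipping blank ones."""
    for line in stream:
        line = line.rstrip()
        if not line:
            continue
        yield line


# io.StringIO is only a stand-in for an open text file here.
fp = io.StringIO("first\n\n   \nsecond  \n\t\nthird\n")
result = list(nonblank_lines(fp))
print(result)   # ['first', 'second', 'third']
```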

/Jorgen

--
// Jorgen Grahn <grahn@ Ph'nglui mglw'nafh Cthulhu
\X/ snipabacken.se> R'lyeh wgah'nagl fhtagn!
 
 
tmallen
 
      11-05-2008
Why do I feel like the coding style in Lutz' "Programming Python" is
very far from idiomatic Python? The content feels dated, and I find
that most answers that I get for Python questions use a different
style from the sort of code I see in this book.

Thomas

On Nov 5, 7:15 am, Jorgen Grahn <(E-Mail Removed)> wrote:
> On Tue, 04 Nov 2008 15:36:23 -0600, Larry Bates <(E-Mail Removed)> wrote:
> > (E-Mail Removed) wrote:
> >> tmallen:
> >>> I'm parsing some text files, and I want to strip blank lines in the
> >>> process. Is there a simpler way to do this than what I have here?
> >>> lines = filter(lambda line: len(line.strip()) > 0, lines)
>
> ...
>
> > Or if you want to filter/loop at the same time, or if you don't want all the
> > lines in memory at the same time:
>
> Or if you want to support potentially infinite input streams, such as
> a pipe or socket. There are many reasons this is my preferred way of
> going through a text file.
>
> > fp = open(filename, 'r')
> > for line in fp:
> >     if not line.strip():
> >         continue
> >
> >     #
> >     # Do something with the non-blank line:
> >     #
> >
> > fp.close()
>
> Often, you want to at least rstrip() all lines anyway,
> for other reasons, and then the extra cost is even less:
>
>     line = line.rstrip()
>     if not line: continue
>     # do something with the rstripped, nonblank lines
>
> /Jorgen
>
> --
> // Jorgen Grahn <grahn@ Ph'nglui mglw'nafh Cthulhu
> \X/ snipabacken.se> R'lyeh wgah'nagl fhtagn!


 
 
Lie
 
      11-05-2008
On Nov 5, 4:56 pm, Marc 'BlackJack' Rintsch <(E-Mail Removed)> wrote:
> On Wed, 05 Nov 2008 14:39:36 +1100, Ben Finney wrote:
> > Marc 'BlackJack' Rintsch <(E-Mail Removed)> writes:
> >
> >> On Wed, 05 Nov 2008 13:18:27 +1100, Ben Finney wrote:
> >>
> >> > Marc 'BlackJack' Rintsch <(E-Mail Removed)> writes:
> >> >
> >> > Your example shows only that they're important for grouping the
> >> > expression from surrounding syntax. As I said.
> >> >
> >> > They are *not* important for making the expression be a generator
> >> > expression in the first place. Parentheses are irrelevant for the
> >> > generator expression syntax.
> >>
> >> Okay, technically correct, but parentheses belong to generator
> >> expressions because they have to be there to separate them from
> >> surrounding syntax, except when there are already enclosing
> >> parentheses. So parentheses are tied to generator expression syntax.
> >
> > No, I think that's factually wrong *and* confusing.
> >
> >     >>> list(i + 7 for i in range(10))
> >     [7, 8, 9, 10, 11, 12, 13, 14, 15, 16]
> >
> > Does this demonstrate that parentheses are “tied to” integer literal
> > syntax? No.
>
> You can use integer literals without parentheses, like the 7 above, but
> you can't use generator expressions without them. They are always
> there. In that way parentheses are tied to generator expressions.
>
> If I see the pattern ``f(x) for x in obj if c(x)`` I check whether it is
> enclosed in parentheses or brackets to decide if it is a generator
> expression or a list comprehension. That may not reflect the formal
> grammar, but it is IMHO the easiest and most pragmatic way to look at
> this as a human programmer.
>
> Ciao,
>         Marc 'BlackJack' Rintsch


The situation is similar to tuples. What makes a tuple is the commas,
not the parens.
What makes a generator expression is "<exp> for <var-or-tuple> in
<exp>".

Parentheses are generally required because without them it's almost
impossible to distinguish the expression from the surrounding syntax.
But they are not part of the formally required syntax.
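The tuple half of that comparison is easy to check. The sketch below uses made-up variable names and also shows the one case where the parentheses alone do make a tuple, the empty one.

```python
t = 1, 2        # the comma makes the tuple; no parentheses needed
n = (1)         # parentheses alone only group: this is the int 1
single = 1,     # a trailing comma makes a one-element tuple
empty = ()      # the empty tuple is the exception: parentheses only

print(type(t).__name__, type(n).__name__,
      type(single).__name__, type(empty).__name__)
```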
 
 
Arnaud Delobelle
 
      11-05-2008
Lie <(E-Mail Removed)> writes:
> What makes a generator expression is "<exp> for <var-or-tuple> in
> <exp>".
>
> Parenthesis is generally required because without it, it's almost
> impossible to differentiate it with the surrounding. But it is not
> part of the formally required syntax.


... But *every* generator expression is surrounded by parentheses, isn't
it?

--
Arnaud
 
 
Steven D'Aprano
 
      11-05-2008
On Wed, 05 Nov 2008 21:23:57 +0000, Arnaud Delobelle wrote:

> Lie <(E-Mail Removed)> writes:
>> What makes a generator expression is "<exp> for <var-or-tuple> in
>> <exp>".
>>
>> Parenthesis is generally required because without it, it's almost
>> impossible to differentiate it with the surrounding. But it is not part
>> of the formally required syntax.

>
> ... But *every* generator expression is surrounded by parentheses, isn't
> it?


Yes, but sometimes they are there in order to call a function, not to
form the generator expression.

I'm surprised that nobody yet has RTFM:

http://docs.python.org/reference/expressions.html

[quote]
A generator expression is a compact generator notation in parentheses:

generator_expression ::= "(" expression genexpr_for ")"
genexpr_for ::= "for" target_list "in" or_test [genexpr_iter]
genexpr_iter ::= genexpr_for | genexpr_if
genexpr_if ::= "if" old_expression [genexpr_iter]

...
The parentheses can be omitted on calls with only one argument.
[end quote]

It seems to me that the FM says that the parentheses *are* part of the
syntax for a generator expression, but if some other syntactic construct
(e.g. a function call) provides the parentheses, then you don't need to
supply a second, redundant, pair.
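A short sketch of that rule, checked here against current Python 3 rather than the 2008 interpreter; the variable names are illustrative. With a single argument the call's own parentheses are enough, but as soon as a second argument appears the generator expression needs its own pair.

```python
# Sole argument: the call parentheses double as the generator's parentheses.
total = sum(x for x in range(4))

# With a second argument the generator expression needs its own pair.
total_from_10 = sum((x for x in range(4)), 10)

# Leaving them out in that case is rejected by the compiler.
try:
    compile("sum(x for x in range(4), 10)", "<example>", "eval")
    unparenthesized_parses = True
except SyntaxError:
    unparenthesized_parses = False

print(total, total_from_10, unparenthesized_parses)   # 6 16 False
```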

I believe that this is the definitive answer, short of somebody reading
the source code and claiming the documentation is wrong.



--
Steven
 
 
Miles
 
      11-05-2008
Ben Finney wrote:
> Falcolas writes:
>
>> Using the surrounding parentheses creates a generator object

>
> No. Using the generator expression syntax creates a generator object.
>
> Parentheses are irrelevant to whether the expression is a generator
> expression. The parentheses merely group the expression from
> surrounding syntax.


As others have pointed out, the parentheses are part of the generator
syntax. If not for the parentheses, a list comprehension would be
indistinguishable from a list literal with a single element, a
generator object. It's also worth remembering that list
comprehensions are distinct from generator expressions and don't
require the creation of a generator object.
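The distinction can be seen directly in a small sketch with illustrative names: the first form is a list comprehension, the second is a list literal whose only element is a generator object.

```python
import types

comprehension = [x for x in range(3)]   # list comprehension: three ints
wrapped = [(x for x in range(3))]       # list literal holding one generator

print(comprehension)    # [0, 1, 2]
print(len(wrapped))     # 1
```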

-Miles
 
 
 
 