Strange behavior with iterables - is this a bug?

 
 
akameswaran@gmail.com
      05-30-2006
Ok, I am confused about this one. I'm not sure if it's a bug or a
feature.. but

>>> ================================ RESTART
>>> f1 = open('word1.txt')
>>> f2 = open('word2.txt')
>>> f3 = open('word3.txt')
>>> print [(i1.strip(),i2.strip(),i3.strip(),) for i1 in f1 for i2 in f2 for i3 in f3]

[('a', 'a', 'a'), ('a', 'a', 'b'), ('a', 'a', 'c')]
>>> l1 = ['a\n','b\n','c\n']
>>> l2 = ['a\n','b\n','c\n']
>>>
>>> l3 = ['a\n','b\n','c\n']
>>> print [(i1.strip(),i2.strip(),i3.strip(),) for i1 in l1 for i2 in l2 for i3 in l3]

[('a', 'a', 'a'), ('a', 'a', 'b'), ('a', 'a', 'c'), ('a', 'b', 'a'),
('a', 'b', 'b'), ('a', 'b', 'c'), ('a', 'c', 'a'), ('a', 'c', 'b'),
('a', 'c', 'c'), ('b', 'a', 'a'), ('b', 'a', 'b'), ('b', 'a', 'c'),
('b', 'b', 'a'), ('b', 'b', 'b'), ('b', 'b', 'c'), ('b', 'c', 'a'),
('b', 'c', 'b'), ('b', 'c', 'c'), ('c', 'a', 'a'), ('c', 'a', 'b'),
('c', 'a', 'c'), ('c', 'b', 'a'), ('c', 'b', 'b'), ('c', 'b', 'c'),
('c', 'c', 'a'), ('c', 'c', 'b'), ('c', 'c', 'c')]

Explanation of the code: the files word1.txt, word2.txt and word3.txt are
all identical, containing the letters a, b and c, one letter per line.
I added the "\n" to the list items so that the lists are identical to what
the file objects return, just to eliminate any possible differences.


If you notice, when using the file objects I don't get the full set
of permutations. I was playing around with doing this via recursion,
etc., but nothing was working, so I reduced it to the simplest case of
explicit nesting. Still no go.
Why does this not work with the file objects, or with any other class I've
made which implements __iter__ and next?

Seems like a bug to me, but maybe I am missing something. It seems to
happen in both 2.3 and 2.4.

 
Terry Reedy
      05-30-2006

akameswaran@gmail.com wrote:
> Ok, I am confused about this one. I'm not sure if it's a bug or a
> feature.. but
>
>>>> ================================ RESTART
>>>> f1 = open('word1.txt')
>>>> f2 = open('word2.txt')
>>>> f3 = open('word3.txt')
>>>> print [(i1.strip(),i2.strip(),i3.strip(),) for i1 in f1 for i2 in f2
>>>> for i3 in f3]

> [('a', 'a', 'a'), ('a', 'a', 'b'), ('a', 'a', 'c')]


A file is something like an iterator and something like an iterable. After
the first pass through the innermost loop, the internal cursor for f3 points
at EOF. To reiterate through the file, you must rewind it in the inner loops.
So try (untested by me)

def initf(f):
    f.seek(0)
    return f

and ...for i2 in initf(f2) for i3 in initf(f3)


>>>> l1 = ['a\n','b\n','c\n']
>>>> l2 = ['a\n','b\n','c\n']
>>>>
>>>> l3 = ['a\n','b\n','c\n']
>>>> print [(i1.strip(),i2.strip(),i3.strip(),) for i1 in l1 for i2 in l2
>>>> for i3 in l3]

> [('a', 'a', 'a'), ('a', 'a', 'b'), ('a', 'a', 'c'), ('a', 'b', 'a'),
> ('a', 'b', 'b'), ('a', 'b', 'c'), ('a', 'c', 'a'), ('a', 'c', 'b'),
> ('a', 'c', 'c'), ('b', 'a', 'a'), ('b', 'a', 'b'), ('b', 'a', 'c'),
> ('b', 'b', 'a'), ('b', 'b', 'b'), ('b', 'b', 'c'), ('b', 'c', 'a'),
> ('b', 'c', 'b'), ('b', 'c', 'c'), ('c', 'a', 'a'), ('c', 'a', 'b'),
> ('c', 'a', 'c'), ('c', 'b', 'a'), ('c', 'b', 'b'), ('c', 'b', 'c'),
> ('c', 'c', 'a'), ('c', 'c', 'b'), ('c', 'c', 'c')]
>
> explanation of code: the files word1.txt, word2.txt and word3.txt are
> all identical, containing the letters a, b and c, one letter per line.
> The lists I've added the "\n" so that the lists are identical to what
> is returned by the file objects. Just eliminating any possible
> differences.


But lists are not file objects and you did not eliminate the crucial
difference in reiterability. Try your experiment with StringIO objects,
which are more nearly identical to file objects.
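
Both suggestions above (rewinding with seek(0), and experimenting with StringIO objects) can be combined into one small experiment. The following is only a sketch, written in Python 3 syntax rather than the Python 2 used in this thread; the names DATA, rewind and triples are illustrative and do not come from the original posts.

```python
import io

DATA = "a\nb\nc\n"  # stand-in for the contents of word1.txt .. word3.txt


def rewind(f):
    """Seek a file-like object back to its start and return it."""
    f.seek(0)
    return f


def triples(f1, f2, f3, reset):
    # The expressions reset(f2) and reset(f3) are re-evaluated every time
    # the corresponding inner loop restarts.
    return [(i1.strip(), i2.strip(), i3.strip())
            for i1 in f1
            for i2 in reset(f2)
            for i3 in reset(f3)]


# Without rewinding, f3 is exhausted after its first pass, and f2 soon after.
once = triples(io.StringIO(DATA), io.StringIO(DATA), io.StringIO(DATA),
               reset=lambda f: f)

# Rewinding the inner files each time their loop restarts gives all 27 tuples.
full = triples(io.StringIO(DATA), io.StringIO(DATA), io.StringIO(DATA),
               reset=rewind)

print(len(once), len(full))
```

With StringIO objects the un-rewound version reproduces the three-tuple result from the original post, which suggests the behavior comes from one-shot iteration rather than from anything specific to disk files.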

Terry Jan Reedy



 
Inyeol Lee
      05-30-2006
On Tue, May 30, 2006 at 01:11:26PM -0700, akameswaran@gmail.com wrote:
[...]
> >>> ================================ RESTART
> >>> f1 = open('word1.txt')
> >>> f2 = open('word2.txt')
> >>> f3 = open('word3.txt')
> >>> print [(i1.strip(),i2.strip(),i3.strip(),) for i1 in f1 for i2 in f2 for i3 in f3]

> [('a', 'a', 'a'), ('a', 'a', 'b'), ('a', 'a', 'c')]
> >>> l1 = ['a\n','b\n','c\n']
> >>> l2 = ['a\n','b\n','c\n']
> >>>
> >>> l3 = ['a\n','b\n','c\n']
> >>> print [(i1.strip(),i2.strip(),i3.strip(),) for i1 in l1 for i2 in l2 for i3 in l3]

> [('a', 'a', 'a'), ('a', 'a', 'b'), ('a', 'a', 'c'), ('a', 'b', 'a'),
> ('a', 'b', 'b'), ('a', 'b', 'c'), ('a', 'c', 'a'), ('a', 'c', 'b'),
> ('a', 'c', 'c'), ('b', 'a', 'a'), ('b', 'a', 'b'), ('b', 'a', 'c'),
> ('b', 'b', 'a'), ('b', 'b', 'b'), ('b', 'b', 'c'), ('b', 'c', 'a'),
> ('b', 'c', 'b'), ('b', 'c', 'c'), ('c', 'a', 'a'), ('c', 'a', 'b'),
> ('c', 'a', 'c'), ('c', 'b', 'a'), ('c', 'b', 'b'), ('c', 'b', 'c'),
> ('c', 'c', 'a'), ('c', 'c', 'b'), ('c', 'c', 'c')]
>
> explanation of code: the files word1.txt, word2.txt and word3.txt are
> all identical, containing the letters a, b and c, one letter per line.
> The lists I've added the "\n" so that the lists are identical to what
> is returned by the file objects. Just eliminating any possible
> differences.


You're comparing a file, which is an ITERATOR, with a list, which is an
ITERABLE but not an ITERATOR. To get the result you want, use this instead:

>>> print [(i1.strip(),i2.strip(),i3.strip(),)
...        for i1 in open('word1.txt')
...        for i2 in open('word2.txt')
...        for i3 in open('word3.txt')]

FYI, to get the same buggy(?) result using lists, try this instead:

>>> l1 = iter(['a\n','b\n','c\n'])
>>> l2 = iter(['a\n','b\n','c\n'])
>>> l3 = iter(['a\n','b\n','c\n'])
>>> print [(i1.strip(),i2.strip(),i3.strip(),) for i1 in l1 for i2 in l2 for i3 in l3]

[('a', 'a', 'a'), ('a', 'a', 'b'), ('a', 'a', 'c')]
>>>
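
The iterator/iterable distinction can also be checked directly. This is a small sketch in Python 3 syntax (the thread itself uses Python 2); the variable names are illustrative only.

```python
letters = ['a\n', 'b\n', 'c\n']

# A list is an iterable: every iter() call returns a new, independent iterator.
fresh_each_time = iter(letters) is not iter(letters)

# An iterator returns itself from iter(), so a second loop over it resumes
# where the first one stopped instead of starting over.
it = iter(letters)
returns_itself = iter(it) is it

first_pass = [s.strip() for s in it]
second_pass = [s.strip() for s in it]  # the iterator is already exhausted

print(fresh_each_time, returns_itself, first_pass, second_pass)
```

A for clause calls iter() on its operand each time the loop starts, which is why a list restarts on every pass of an inner loop and an exhausted iterator does not.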



-Inyeol Lee
 
Gary Herron
      05-30-2006
akameswaran@gmail.com wrote:

> Ok, I am confused about this one. I'm not sure if it's a bug or a
> feature.. but

List comprehension is a great shortcut, but when the shortcut starts
causing trouble, it is better to go with the old ways. You need to reopen each
file each time you want to iterate through it. You should be able to
understand the difference between these bits of code.

The first bit opens each file once but uses two of them (f2 and f3)
multiple times. Iterating over a file that is already at EOF yields nothing.

The second bit reopens each file every time it is reused. That
works correctly.

And that suggests the third bit of correctly working code, which uses a list
comprehension.

# Fails because files are opened once but reused
f1 = open('word1.txt')
f2 = open('word2.txt')
f3 = open('word3.txt')
for i1 in f1:
    for i2 in f2:
        for i3 in f3:
            print (i1.strip(),i2.strip(),i3.strip())

and

# Works because files are reopened for each reuse:
f1 = open('word1.txt')
for i1 in f1:
    f2 = open('word2.txt')
    for i2 in f2:
        f3 = open('word3.txt')
        for i3 in f3:
            print (i1.strip(),i2.strip(),i3.strip())

and

# Also works because files are reopened for each use:
print [(i1.strip(),i2.strip(),i3.strip())
       for i1 in open('word1.txt')
       for i2 in open('word2.txt')
       for i3 in open('word3.txt')]

Hope that's clear!

Gary Herron





akameswaran@gmail.com
      05-30-2006
DOH!!

thanks a lot. had to be something stupid on my part.

Now I get it

 
akameswaran@gmail.com
      05-31-2006




My original problem was with recursion. I explicitly nested it out to
try to understand the behavior, and foolishly looked in the wrong
spot for the problem, which is that a file is not reiterable. In truth I
was never concerned about file objects; the problem was failing with my
own custom iterators (which also were not reiterable), and I switched to
files to rule out possible deficiencies in my own code. I was
simply chasing down the wrong problem. As was pointed out to me in
another thread, the cleanest implementation which would allow me to use
one copy of the file (in my example the files are identical) would be
to use a trivial iterator class that opens the file, uses tell to track
position and seek to set position, and returns the appropriate line for
that instance, thus eliminating unnecessary file opens and closes.
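
One way such a tell/seek wrapper might look is sketched below. This is not the code from the other thread, which is not shown here; the class name SharedFileLines is hypothetical, the syntax is Python 3, and an io.StringIO object stands in for an opened word1.txt.

```python
import io


class SharedFileLines:
    """Reiterable view over one open, seekable file.

    Each call to __iter__ starts a new pass that remembers its own offset,
    so several nested loops can share a single file object.
    """

    def __init__(self, fileobj):
        self._f = fileobj

    def __iter__(self):
        pos = 0
        while True:
            self._f.seek(pos)          # restore this pass's position
            line = self._f.readline()  # readline() keeps tell() usable
            if not line:
                return
            pos = self._f.tell()       # remember where this pass stopped
            yield line


# Stand-in for open('word1.txt'); any seekable file object should do.
words = SharedFileLines(io.StringIO("a\nb\nc\n"))

triples = [(i1.strip(), i2.strip(), i3.strip())
           for i1 in words for i2 in words for i3 in words]

print(len(triples))
```

The trade-off is an extra seek and tell call per line, in exchange for keeping only one file open no matter how deeply the loops are nested.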

 
Gary Herron
      05-31-2006
I see.

I wouldn't call "tell" and "seek" clean. Here's another suggestion. Use
    l1 = open(...).readlines()
to read the whole file into a (nicely reiterable) list residing in
memory, and then iterate through the list as you wish. This would be a
problem for memory consumption only if your files are MANY megabytes long.
(But if they were that big, you wouldn't be trying to find
all permutations, would you?)
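
The in-memory approach can be sketched as follows, in Python 3 syntax. The io.StringIO object is a stand-in for an opened word file, and the use of itertools.product (available since Python 2.6, so newer than the versions discussed in this thread) is an addition for illustration, not part of the suggestion above.

```python
import io
from itertools import product

# Stand-in for open('word1.txt'); the thread's files hold a, b, c on separate lines.
f = io.StringIO("a\nb\nc\n")

# Read everything into a list once; a list can be iterated any number of times.
lines = [line.strip() for line in f.readlines()]

# Explicit nesting over the list, as in the original post.
nested = [(i1, i2, i3) for i1 in lines for i2 in lines for i3 in lines]

# The same result from the standard library.
via_product = list(product(lines, repeat=3))

print(len(nested), nested == via_product)
```

Note that itertools.product also copies its inputs into memory before producing anything, so it shares the memory caveat mentioned above.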

Gary Herron

 
akameswaran@gmail.com
      05-31-2006
My original concern and reason for going the iterator/generator route
was exactly for very large lists. It is unnecessary in this example, but
it is exactly what I was exploring. I wouldn't be using list comprehension
for generating the permutations. Where all this came from was
creating a generator/iterator to handle very large permutations.
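
A lazy, recursive generator along these lines might look like the sketch below. It is not the poster's actual code, which is not shown in this thread; the name lazy_product and the convention of passing zero-argument factories (so that every pass gets a fresh iterator) are assumptions made for illustration, in Python 3 syntax.

```python
def lazy_product(*make_iters):
    """Yield combination tuples one at a time.

    Each argument is a zero-argument callable returning a FRESH iterator,
    which sidesteps the one-shot iterator problem discussed in this thread.
    """
    if not make_iters:
        yield ()
        return
    first, rest = make_iters[0], make_iters[1:]
    for item in first():
        # Recurse: the remaining factories are called again for every item.
        for tail in lazy_product(*rest):
            yield (item,) + tail


def letters():
    # Hypothetical source; for files this could be lambda: open('word1.txt').
    return iter("abc")


gen = lazy_product(letters, letters, letters)
first = next(gen)                 # nothing beyond this tuple is computed yet
remaining = sum(1 for _ in gen)   # consume the rest without storing it

print(first, remaining)
```

Only one tuple exists at a time, so memory use stays roughly constant however many combinations there are, at the cost of re-creating the inner iterators on every pass.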





 