Q on explicitly calling file.close

 
 
kj
      09-05-2009




There's something wonderfully clear about code like this:

    # (1)
    def spam(filename):
        for line in file(filename):
            do_something_with(line)

It is indeed pseudo-codely beautiful. But I gather that it is not
correct to do this, and that instead one should do something like

    # (2)
    def spam(filename):
        fh = file(filename)
        try:
            for line in fh:
                do_something_with(line)
        finally:
            fh.close()

...or alternatively, if the with-statement is available:

    # (3)
    def spam(filename):
        with file(filename) as fh:
            for line in fh:
                do_something_with(line)

Mind you, (3) is almost as simple as (1) (only one additional line),
but somehow it lacks (1)'s direct simplicity. (And it adds one
more indentation level, which I find annoying.) Furthermore, I
don't recall ever coming across either (2) or (3) "in the wild",
even after reading a lot of high-quality Python code (e.g. standard
library modules).

Finally, I was under the impression that Python closed filehandles
automatically when they were garbage-collected. (In fact (3)
suggests as much, since it does not include an explicit call to
fh.close.) If so, the difference between (1) and (3) does not seem
very big. What am I missing here?

kynn
 
 
MRAB
      09-05-2009

CPython uses reference counting, so an object is garbage collected as
soon as there are no references to it, but that's just an implementation
detail.

Other implementations, such as Jython and IronPython, don't use
reference counting, so you don't know when an object will be garbage
collected, which means that the file might remain open for an unknown
time afterwards in case 1 above.

Most people use CPython, so it's not surprising that case 1 is so
common.
 
 
Dave Angel
      09-05-2009

We have to distinguish between reference counted and garbage collected.
As MRAB says, when the reference count goes to zero, the file is
immediately closed, in CPython implementation. So all three are
equivalent on that platform.

But if you're not sure the code will run on CPython, then you have to
have something that explicitly catches the out-of-scopeness of the file
object. Both your (2) and (3) do that, with different syntaxes.

DaveA

 
 
r
      09-05-2009

Stop being lazy and close the file. You don't want open file objects
just floating around in memory. Even the docs say something like
"yes, Python will free the memory associated with a file object but
you can never *really* be sure *when* this will happen, so just
explicitly close the damn thing!". Besides, you can't guarantee that
any data has been written without calling f.flush() or f.close()
first. What if your program crashes and no data is written? Huh?
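
A minimal sketch of the buffering point, assuming Python 3 (so open() rather than the thread's file()) and a hypothetical scratch file created with tempfile. A small write normally sits in the buffer until flush() or close() is called, so a second reader sees nothing before the close:

```python
import os
import tempfile

# Hypothetical scratch file, used only for this demonstration.
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)

f = open(path, "w")
f.write("hello")                    # small write: normally held in the buffer

with open(path) as reader:
    before_close = reader.read()    # read while the writer is still open

f.close()                           # close() flushes the buffer to the file

with open(path) as reader:
    after_close = reader.read()     # read again after the close

os.remove(path)
print(repr(before_close), repr(after_close))
```

If the program died between the write and the close, the buffered text could be lost, which is the risk described above.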

I guess I could put my pants on by jumping into both legs at the same
time, thereby saving one step, but I might fall down and break my arm. I
would much rather just use the one-leg-at-a-time approach...
 
 
Dennis Lee Bieber
      09-05-2009
On Sat, 5 Sep 2009 16:14:02 +0000 (UTC), kj <(E-Mail Removed)>
declaimed the following in gmane.comp.python.general:

> ...or alternatively, if the with-statement is available:
>
>     # (3)
>     def spam(filename):
>         with file(filename) as fh:
>             for line in fh:
>                 do_something_with(line)
>

<snip>
> Finally, I was under the impression that Python closed filehandles
> automatically when they were garbage-collected. (In fact (3)
> suggests as much, since it does not include an explicit call to
> fh.close.) If so, the difference between (1) and (3) does not seem
> very big. What am I missing here?


In the case of the with construct, in effect the with statement IS
equivalent to:

    fh = file(filename)
    for line in fh:
        do something...
    fh.close()

with the proper safeguards to ensure the close() is called. Essentially,
anything "opened" by a with clause is "closed" when the block is left.
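
The "proper safeguards" can be illustrated with a small sketch. The Tracked class below is a hypothetical stand-in for a file object, not anything from the standard library; it only records whether its cleanup ran. The with statement calls __exit__ when the block is left, even when the block raises:

```python
class Tracked:
    """Hypothetical stand-in for a file object that records its cleanup."""

    def __init__(self):
        self.closed = False

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.closed = True   # plays the role of fh.close()
        return False         # do not swallow the exception


resource = Tracked()
try:
    with resource as fh:
        raise ValueError("failure inside the block")
except ValueError:
    pass

print(resource.closed)
```

Real file objects implement the same two methods, with __exit__ calling close().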
--
Wulfraed Dennis Lee Bieber KD6MOG
(E-Mail Removed) HTTP://wlfraed.home.netcom.com/

 
 
Tim Chase
      09-05-2009
> CPython uses reference counting, so an object is garbage collected as
> soon as there are no references to it, but that's just an implementation
> detail.
>
> Other implementations, such as Jython and IronPython, don't use
> reference counting, so you don't know when an object will be garbage
> collected, which means that the file might remain open for an unknown
> time afterwards in case 1 above.
>
> Most people use CPython, so it's not surprising that case 1 is so
> common.


Additionally, many scripts just use a small number of files (say,
1-5 files), so having a file handle open for the duration of the
run is minimal overhead.

On the other hand, when processing thousands of files, I always
explicitly close each file to make sure I don't exhaust some
file-handle limit the OS or interpreter may enforce.
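
A rough sketch of the many-files case, assuming Python 3 and hypothetical test files generated in a temporary directory. Because each file is closed before the next one is opened, at most one handle is in use at a time, regardless of how many files are processed:

```python
import os
import tempfile


def total_lines(paths):
    """Sum line counts, holding at most one file open at a time."""
    total = 0
    for path in paths:
        with open(path) as fh:      # closed before the next file is opened
            total += sum(1 for _ in fh)
    return total


# Hypothetical test data: many small files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for i in range(2000):
        path = os.path.join(tmp, "f%d.txt" % i)
        with open(path, "w") as out:
            out.write("one\ntwo\n")
        paths.append(path)
    result = total_lines(paths)
```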

-tkc




 
 
r
      09-05-2009
On Sep 5, 2:47 pm, Dennis Lee Bieber <(E-Mail Removed)> wrote:
(snip)
> > Finally, I was under the impression that Python closed filehandles
> > automatically when they were garbage-collected. (In fact (3)
> > suggests as much, since it does not include an explicit call to
> > fh.close.) If so, the difference between (1) and (3) does not seem
> > very big. What am I missing here?


True, but I find the with statement (while quite useful in general
practice) is not a "cure all" for situations that need an exception
caught. In that case the laborious finger-wrecking syntax of "f.close()"
must be painstakingly typed, letter by painful letter.

f-.-c-l-o-s-e-(-)

It's just not fair ;-(
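
For the case where an exception does need to be caught, one possible arrangement is to wrap the with block in try/except, so the with statement still performs the close. This is a sketch assuming Python 3; first_line and the path passed to it are hypothetical names chosen for the example:

```python
def first_line(filename):
    """Return the first line of a file, or None if it cannot be read."""
    try:
        with open(filename) as fh:   # the with statement handles the close
            return fh.readline()
    except OSError as exc:           # the try/except handles the failure
        print("could not read %s: %s" % (filename, exc))
        return None


# Hypothetical path that is assumed not to exist.
missing = first_line("/no/such/dir/no_such_file.txt")
```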
 
 
Steven D'Aprano
      09-05-2009
On Sat, 05 Sep 2009 16:14:02 +0000, kj wrote:

> Finally, I was under the impression that Python closed filehandles
> automatically when they were garbage-collected. (In fact (3) suggests
> as much, since it does not include an explicit call to fh.close.) If so,
> the difference between (1) and (3) does not seem very big. What am I
> missing here?


(1) Python the language will close file handles, but doesn't guarantee
when. Some implementations (e.g. CPython) will close them immediately the
file object goes out of scope. Others (e.g. Jython) will close them
"eventually", which may be when the program exits.

(2) If the file object never goes out of scope, say because you've stored
a reference to it somewhere, the file will never be closed and you will
leak file handles. Since the OS only provides a finite number of them,
any program which uses a large number of files is at risk of running out.

(3) For quick and dirty scripts, or programs that only use one or two
files, relying on the VM to close the file is sufficient (although lazy
in my opinion *wink*) but for long-running applications using many files,
or for debugging, you may want more control over what happens when.
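
Point (2) can be sketched as follows, assuming Python 3. The open_files list and the remember function are hypothetical, standing in for any long-lived structure that keeps a reference to a file object. While the reference exists, no implementation will close the file for you:

```python
import os
import tempfile

open_files = []   # hypothetical long-lived container, e.g. a cache


def remember(path):
    fh = open(path)
    open_files.append(fh)   # the stored reference keeps the object alive
    return fh.readline()


# Hypothetical demo file, created only for this example.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as out:
    out.write("first\n")

line = remember(path)
still_open = not open_files[0].closed   # nothing has closed it yet

# Explicit cleanup is the only thing that releases the handles here.
for fh in open_files:
    fh.close()
all_closed = all(fh.closed for fh in open_files)
os.remove(path)
```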


--
Steven
 
 
kj
      09-06-2009
In <02b2e6ca$0$17565$(E-Mail Removed)> Steven D'Aprano <(E-Mail Removed)> writes:

>(3) For quick and dirty scripts, or programs that only use one or two
>files, relying on the VM to close the file is sufficient (although lazy
>in my opinion *wink*)


It's not a matter of laziness or industriousness, but rather of
code readability. The real problem here is not the close() per
se, but rather all the additional machinery required to ensure that
the close happens. When the code is working with multiple file
handles simultaneously, one ends up with a thicket of try/finally's
that makes the code just *nasty* to look at. E.g., even with only
two files, namely an input and an output file, compare:

    def nice(from_, to_):
        to_h = file(to_, "w")
        for line in file(from_):
            print >> to_h, munge(line)

    def nasty(from_, to_):
        to_h = file(to_, "w")
        try:
            from_h = file(from_)
            try:
                for line in from_h:
                    print >> to_h, munge(line)
            finally:
                from_h.close()
        finally:
            to_h.close()

I leave to your imagination the joys of reading the code for
hairy(from_, to_, log_), where log_ is a third file to collect
warning messages.
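
For comparison, here is a sketch of how the same shapes can be written without nested try/finally blocks. It assumes Python 3.3 or later, which postdates this thread: a single with statement may open several files, and contextlib.ExitStack handles a variable number of them. The munge body and the blank-line warning are placeholders invented for the example, since the thread does not define them:

```python
import os
import tempfile
from contextlib import ExitStack


def munge(line):
    # Placeholder for the thread's undefined munge(); assumed to return text.
    return line.rstrip("\n").upper()


def flat(from_, to_):
    # Both files are closed when the block exits, in reverse order of opening.
    with open(from_) as from_h, open(to_, "w") as to_h:
        for line in from_h:
            print(munge(line), file=to_h)


def hairy(from_, to_, log_):
    # ExitStack closes every file entered so far, even if a later open() fails.
    with ExitStack() as stack:
        from_h = stack.enter_context(open(from_))
        to_h = stack.enter_context(open(to_, "w"))
        log_h = stack.enter_context(open(log_, "w"))
        for number, line in enumerate(from_h, 1):
            if not line.strip():
                print("line %d is blank" % number, file=log_h)
            print(munge(line), file=to_h)


# Hypothetical input, written to a temporary directory for the demo.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "in.txt")
    dst = os.path.join(tmp, "out.txt")
    log = os.path.join(tmp, "log.txt")
    with open(src, "w") as fh:
        fh.write("spam\n\neggs\n")
    flat(src, dst)
    with open(dst) as fh:
        flat_output = fh.read()
    hairy(src, dst, log)
    with open(dst) as fh:
        output = fh.read()
    with open(log) as fh:
        warnings = fh.read()
```

Adding a third file costs one extra line in hairy() rather than one extra level of nesting.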

kynn
 
 
Steven D'Aprano
      09-06-2009
On Sun, 06 Sep 2009 01:51:50 +0000, kj wrote:

> In <02b2e6ca$0$17565$(E-Mail Removed)> Steven D'Aprano
> <(E-Mail Removed)> writes:
>
>>(3) For quick and dirty scripts, or programs that only use one or two
>>files, relying on the VM to close the file is sufficient (although lazy
>>in my opinion *wink*)

>
> It's not a matter of laziness or industriousness, but rather of code
> readability. The real problem here is not the close() per se, but
> rather all the additional machinery required to ensure that the close
> happens. When the code is working with multiple file handles
> simultaneously, one ends up with a thicket of try/finally's that makes
> the code just *nasty* to look at.


Yep, that's because dealing with the myriad of things that *might* (but
probably won't) go wrong when dealing with files is *horrible*. Real
world code is almost always much nastier than the nice elegant algorithms
we hope for.

Most people know they have to deal with errors when opening files. The
best programmers deal with errors when writing to files. But only a few
of the most pedantic coders even attempt to deal with errors when
*closing* the file. Yes, closing the file can fail. What are you going to
do about it? At the least, you should notify the user, then continue.
Dying with an uncaught exception in the middle of processing millions of
records is Not Cool. But close failures are so rare that we just hope
we'll never experience one.
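
A minimal sketch of "notify the user, then continue", assuming Python 3. The close_and_report helper, the FailingFile class, and the file name are all hypothetical, introduced only to simulate a close() that fails:

```python
import sys


def close_and_report(fh, name):
    """Close fh; on failure, warn the user and carry on.

    Returns True if the close succeeded, False otherwise.
    """
    try:
        fh.close()
    except OSError as exc:
        print("warning: closing %s failed: %s" % (name, exc), file=sys.stderr)
        return False
    return True


class FailingFile:
    """Hypothetical file-like object whose close() always fails."""

    def close(self):
        raise OSError("disk full")


# The failure is reported on stderr and processing can continue.
ok = close_and_report(FailingFile(), "records.dat")
```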

It really boils down to this... do you want to write correct code, or
elegant code?



--
Steven
 