Speed ain't bad

 
 
Anders J. Munch
      01-01-2005
"Bulba!" wrote:
>
> One of the posters inspired me to do profiling on my newbie script
> (pasted below). After measurements I have found that the speed
> of Python, at least in the area where my script works, is surprisingly
> high.


Pretty good code for someone who calls himself a newbie.

One line that puzzles me:
> sfile=open(sfpath,'rb')


You never use sfile again.
In any case, you should explicitly close all files that you open. Even
if there's an exception:

    sfile = open(sfpath, 'rb')
    try:
        <stuff to do with the file open>
    finally:
        sfile.close()
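
On Python versions newer than the 2.4 used in this thread, the same guarantee is usually written with a `with` block. The sketch below is only an illustration: `read_bytes` is a hypothetical helper, not part of the original script.

```python
def read_bytes(path):
    """Return the whole file as bytes.

    The with-block closes the file on the way out, whether read()
    returns normally or raises, just like the try/finally form.
    """
    with open(path, 'rb') as sfile:
        return sfile.read()
```

The try/finally version remains correct; the `with` form just makes it harder to forget the close.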

>
> The only thing I'm missing in this picture is knowledge if my script
> could be further optimised (not that I actually need better
> performance, I'm just curious what possible solutions could be).
>
> Any takers among the experienced guys?


Basically the way to optimise these things is to cut down on anything
that does I/O: make as few calls to os.path.is{dir,file}, os.stat, open
and the like as you can get away with.

One way to do that is caching; e.g. storing names of known directories
in a set (sets.Set()) and checking that set before calling
os.path.isdir. I haven't spotted any obvious opportunities for that
in your script, though.
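
A minimal sketch of that caching idea, assuming a modern Python where the built-in `set` replaces `sets.Set()`. The names `_known_dirs` and `ensure_dir` are made up for illustration, and the cache can go stale if something else deletes a directory while the script runs.

```python
import os

# Hypothetical module-level cache of directories already confirmed to exist.
_known_dirs = set()

def ensure_dir(path):
    """Make sure path exists as a directory.

    Returns True if the filesystem had to be consulted, False on a cache hit.
    """
    if path in _known_dirs:
        return False              # cache hit: no filesystem call at all
    if not os.path.isdir(path):
        os.makedirs(path)
    _known_dirs.add(path)
    return True                   # cache miss: the disk was touched
```

This only pays off when the same directory is checked many times, for example once per file written into it.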

Another way is the strategy of "it's easier to ask forgiveness than to
ask permission".
If you replace:
    if(not os.path.isdir(zfdir)):
        os.makedirs(zfdir)
with:
    try:
        os.makedirs(zfdir)
    except EnvironmentError:
        pass

then not only will your script become a micron more robust, but
assuming zfdir typically does not exist, you will have saved the call
to os.path.isdir.
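
For readers on Python 3, the same pattern can be sketched as a small function. `make_dirs_eafp` is a hypothetical name; `OSError` is caught because `EnvironmentError` is just an alias for it in Python 3. Like the snippet above, it swallows every failure, not only "already exists".

```python
import os

def make_dirs_eafp(zfdir):
    """Try to create zfdir without checking first.

    Returns True if the directory was created, False if makedirs failed
    for any reason (the failure is deliberately ignored, as in the post).
    """
    try:
        os.makedirs(zfdir)
    except OSError:
        return False
    return True
```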

- Anders


 
Bulba!
      01-02-2005
On Sat, 1 Jan 2005 14:20:06 +0100, "Anders J. Munch" wrote:

>> One of the posters inspired me to do profiling on my newbie script
>> (pasted below). After measurements I have found that the speed
>> of Python, at least in the area where my script works, is surprisingly
>> high.

>
>Pretty good code for someone who calls himself a newbie.


<blush>

>One line that puzzles me:
>> sfile=open(sfpath,'rb')


>You never use sfile again.


Right! It's a leftover from a previous implementation (that
used bzip2). Forgot to delete it, thanks.

>Another way is the strategy of "it's easier to ask forgiveness than to
>ask permission".
>If you replace:
>    if(not os.path.isdir(zfdir)):
>        os.makedirs(zfdir)
>with:
>    try:
>        os.makedirs(zfdir)
>    except EnvironmentError:
>        pass


>then not only will your script become a micron more robust, but
>assuming zfdir typically does not exist, you will have saved the call
>to os.path.isdir.


Yes, this is the kind of habit that low-level languages like C,
which lack features like exceptions, ingrain in a programmer's mind...

Getting out of this straitjacket is kind of hard - it would not have
crossed my mind to try something like what you showed me, thanks!

Exceptions in Python are a GODSEND. I strongly recommend that
any former C programmer wanting to get rid of the "straitjacket"
read the following to get an idea of how not to write C code in Python
and instead exploit the better side of a VHLL:

http://gnosis.cx/TPiP/appendix_a.txt




--
It's a man's life in a Python Programming Association.
 
Jeff Shannon
      01-03-2005
Anders J. Munch wrote:

> Another way is the strategy of "it's easier to ask forgiveness than to
> ask permission".
> If you replace:
>     if(not os.path.isdir(zfdir)):
>         os.makedirs(zfdir)
> with:
>     try:
>         os.makedirs(zfdir)
>     except EnvironmentError:
>         pass
>
> then not only will your script become a micron more robust, but
> assuming zfdir typically does not exist, you will have saved the call
> to os.path.isdir.


... at the cost of an exception frame setup and an incomplete call to
os.makedirs(). It's an open question whether the exception setup and
recovery take less time than the call to isdir(), though I'd expect
probably not. The exception route definitely makes more sense if the
makedirs() call is likely to succeed; if it's likely to fail, then
things are murkier.

Since isdir() *is* a disk i/o operation, the exception route is
probably preferable in this case anyhow. In either case, one
must touch the disk; in the exception case, there will only ever be
one disk access (which either succeeds or fails), while in the other
case, there may be two disk accesses. However, if it weren't for the
extra disk i/o operation, the 'if ...' might be slightly faster,
even though the exception-based route is more Pythonic.

Jeff Shannon
Technician/Programmer
Credit International

 
John Machin
      01-03-2005
Anders J. Munch wrote:
> Another way is the strategy of "it's easier to ask forgiveness than to
> ask permission".
> If you replace:
>     if(not os.path.isdir(zfdir)):
>         os.makedirs(zfdir)
> with:
>     try:
>         os.makedirs(zfdir)
>     except EnvironmentError:
>         pass
>
> then not only will your script become a micron more robust, but
> assuming zfdir typically does not exist, you will have saved the call
> to os.path.isdir.


1. Robustness: Both versions will "crash" (in the sense of an unhandled
exception) in the situation where zfdir exists but is not a directory.
The revised version just crashes later than the OP's version.
Trapping EnvironmentError seems not very useful -- the result will not
distinguish (on Windows 2000 at least) between the 'existing dir' and
'existing non-directory' cases.


Python 2.4 (#60, Nov 30 2004, 11:49:19) [MSC v.1310 32 bit (Intel)] on win32
>>> import os, os.path
>>> os.path.exists('fubar_not_dir')
True
>>> os.path.isdir('fubar_not_dir')
False
>>> os.makedirs('fubar_not_dir')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "c:\Python24\lib\os.py", line 159, in makedirs
    mkdir(name, mode)
OSError: [Errno 17] File exists: 'fubar_not_dir'
>>> try:
...     os.mkdir('fubar_not_dir')
... except EnvironmentError:
...     print 'trapped env err'
...
trapped env err
>>> os.mkdir('fubar_is_dir')
>>> os.mkdir('fubar_is_dir')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
OSError: [Errno 17] File exists: 'fubar_is_dir'
>>>
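
One way to tell the two cases apart, sketched here for Python 3 under the assumption that "already exists as a directory" is the only failure worth ignoring. `ensure_dir_strict` is a hypothetical name, not something from the OP's script.

```python
import errno
import os

def ensure_dir_strict(path):
    """Create path if it is missing.

    An existing directory is accepted silently. Any other failure,
    including an existing non-directory with the same name, is re-raised.
    """
    try:
        os.makedirs(path)
    except OSError as exc:
        # Errno 17 (EEXIST) is reported for both cases, so follow up
        # with isdir() only on the failure path.
        if exc.errno != errno.EEXIST or not os.path.isdir(path):
            raise
```

The extra isdir() call happens only after makedirs has already failed, so the common "directory does not exist yet" path still costs a single call. In Python 3.2 and later, `os.makedirs(path, exist_ok=True)` offers similar behaviour out of the box.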


2. Efficiency: I don't see the disk I/O inefficiency in calling
os.path.isdir() before os.makedirs() -- if the relevant part of the
filesystem wasn't already in memory, the isdir() call would make it so,
and makedirs() would get a free ride, yes/no?

 
Anders J. Munch
      01-04-2005
"John Machin" wrote:
> 1. Robustness: Both versions will "crash" (in the sense of an unhandled
> 2. Efficiency: I don't see the disk I/O inefficiency in calling


3. Don't itemise perceived flaws in other people's postings. It may
give off a hostile impression.

> 1. Robustness: Both versions will "crash" (in the sense of an unhandled
> exception) in the situation where zfdir exists but is not a directory.
> The revised version just crashes later than the OP's version
> Trapping EnvironmentError seems not very useful -- the result will not
> distinguish (on Windows 2000 at least) between the 'existing dir' and
> 'existing non-directory' cases.


Good point; my version has room for improvement. But at least it fixes
the race condition between isdir and makedirs.

What I like about EnvironmentError is that it's easier to use than
figuring out which one of IOError or OSError applies (and whether that
can be relied on, cross-platform).

> 2. Efficiency: I don't see the disk I/O inefficiency in calling
> os.path.isdir() before os.makedirs() -- if the relevant part of the
> filesystem wasn't already in memory, the isdir() call would make it
> so, and makedirs() would get a free ride, yes/no?


Perhaps. Looking stuff up in operating system tables and buffers takes
time too. And then there's network latency; how much local caching do
you get for an NFS mount or SMB share?

If you really want to know, measure.
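
A rough sketch of such a measurement, using the standard `timeit` module on Python 3. The function names and the choice to time only the "directory already exists" case are assumptions made for illustration; results will vary with the OS, the filesystem and whether the path is local or on a network share.

```python
import os
import tempfile
import timeit

def lbyl(path):
    """Look before you leap: check with isdir(), then create."""
    if not os.path.isdir(path):
        os.makedirs(path)

def eafp(path):
    """Ask forgiveness: just try to create and ignore the failure."""
    try:
        os.makedirs(path)
    except OSError:
        pass

def measure(func, path, number=1000):
    """Return the total seconds taken by `number` calls of func(path)."""
    return timeit.timeit(lambda: func(path), number=number)

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        existing = os.path.join(root, "already_there")
        os.mkdir(existing)
        # Only the case where the directory already exists is timed here.
        print("LBYL:", measure(lbyl, existing))
        print("EAFP:", measure(eafp, existing))
```

Timing the "directory does not exist" case needs a fresh path for every call, which the sketch leaves out.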

- Anders


 