Tuning a select() loop for os.popen3()

 
 
Christopher DeMarco
      12-30-2005
Hi all...

I've written a class to provide an interface to popen; I've included
the actual select() loop below. I'm finding that "sometimes" popen'd
processes take "a really long time" to complete and "other times" I
get incomplete stdout.

E.g.:

- on boxA ffmpeg returns in ~25s; on boxB (comparable hardware,
identical OS) ~5m.

- ``ls'' on a directory with 15 nodes returns full stdout; ``ls -R''
on that same directory (with ~32K nodes beneath) stops after
4097KB of output.

The code in question is running on Linux 2.6.x; no cross-platform
portability desired. popen'd commands will never be interactive; I
just wanna read stdout/stderr and perhaps feed a one-shot string via
stdin.

Here's the relevant code (stripped of comments and various OO
setup/output stuff):


# # ## ### ##### ######## ############# #####################
# cut here

def run(self):
    import os, select, syslog
    (_stdin, _stdout, _stderr) = os.popen3(self.command)

    stdoutChunks = []; stderrChunks = []
    readList = [_stdout, _stderr]
    # Compare by value; "is not" tests object identity, not string equality.
    if self.stdinString != "": writeList = [_stdin]
    else: writeList = []
    readStderr = False; readStdout = False

    i = 0
    while True:
        i += 1
        (r, w, x) = select.select(readList, writeList, [], 1)
        read = ""

        if self.stdinString != "":
            if w:
                bytesWritten = os.write(_stdin.fileno(), self.stdinString)
                writeList.remove(_stdin)
                _stdin.close()
                continue

        if r:
            if _stderr in r:
                readStderr = True
                read = os.read(_stderr.fileno(), 16384)
                if read: stderrChunks.append(read)
                else: readList.remove(_stderr)
                continue

            elif _stdout in r:
                readStdout = True
                read = os.read(_stdout.fileno(), 16384)
                if read:
                    stdoutChunks.append(read)
                    syslog.syslog("Command instance read %d from stdout" % len(read))
                else: readList.remove(_stdout)
                continue

        else:
            # Nothing was readable within the 1-second select() timeout.
            if \
                (readStderr and self.dieOnStderr) \
                or \
                readStdout:
                syslog.syslog("Command instance finished")
                break
    return

# cut here
# # ## ### ##### ######## ############# #####################


Tweaking (a) the os.read() buffer size and (b) the select() timeout
and testing with ``ls -R'' on a directory with ~ 32K nodes beneath, I
find the following trends:

1. With a very small os.read() buffer, I get full stdout, but running
time is rather long. Running time increases as select() timeout
increases.

2. With a very large os.read() buffer, I get incomplete stdout (but
running time is *very* fast). As select() timeout increases, I get
better and better results - with a select() timeout of 0.2 I seem to
get reliably full stdout.


The values used in the code I've pasted above - large buffer, large
select() timeout - seem to perform "well enough"; none of the
previously described problems manifest. However, ``ls -lR /'' (way
more than 32K nodes) "sometimes" gives incomplete stdout.


My first question, then, is paranoid: I've run all these benchmarks
because the application using this code saw a HUGE performance hit
when we started using popen'd commands which generated "lots of"
output.

Is there anything wrong with the logic in my code?!

Will I see severe performance degradation (or worse, incomplete
stdout/stderr) as system variables change (e.g. system load increases,
popen'd program changes, popen'd program increases workload, etc.)?


Next question - how do I tune the select() timeout and the os.read()
buffer correctly? Is it *really* per-command, per-system,
per-phase-of-moon voodoo? Is there a Recommended Setup for such a
select() loop?
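
[Editorial sketch] One way to reason about the os.read() buffer question: os.read(fd, n) returns at most n bytes, namely whatever is available at that moment, and returns an empty result only at end-of-file. The sketch below uses a plain os.pipe() instead of a child process (an assumption made to keep it self-contained); drain, demo, read_size, and payload are hypothetical names. Under these assumptions the buffer size changes how many reads are needed, not how much data arrives, provided the loop keeps reading until EOF.

```python
import os


def drain(fd, read_size):
    """Read fd until EOF; return (all_data, number_of_nonempty_reads)."""
    chunks = []
    reads = 0
    while True:
        chunk = os.read(fd, read_size)  # returns at most read_size bytes
        if not chunk:                   # empty result means EOF
            break
        chunks.append(chunk)
        reads += 1
    return b"".join(chunks), reads


def demo(read_size, payload=b"x" * 4000):
    """Push payload through a pipe and drain it with the given read size."""
    r, w = os.pipe()
    os.write(w, payload)  # small enough to fit in the pipe buffer
    os.close(w)           # closing the write end is what produces EOF
    try:
        return drain(r, read_size)
    finally:
        os.close(r)


small_data, small_reads = demo(100)
large_data, large_reads = demo(16384)
```

Both calls should return the same bytes; only the read count differs.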


Thanks in advance, for insight as well as for tolerating my
long-windedness...


--
Christopher DeMarco <(E-Mail Removed)>
Alephant Systems (http://alephant.net)
PGP public key at http://pgp.alephant.net
+1-412-708-9660


 
Donn Cave
      12-31-2005
Christopher DeMarco wrote:

> I've written a class to provide an interface to popen; I've included
> the actual select() loop below. I'm finding that "sometimes" popen'd
> processes take "a really long time" to complete and "other times" I
> get incomplete stdout.


....

> My first question, then, is paranoid: I've run all these benchmarks
> because the application using this code saw a HUGE performance hit
> when we started using popen'd commands which generated "lots of"
> output.
>
> Is there anything wrong with the logic in my code?!


I tried a modified version with 'ls -R .', which yields about
1 MB of data, and saw no problems on Mac OS X. Same data, and
about the same time as 'ls -R .' from the shell, maybe 5% longer.

But I modified it a lot. I removed every "continue", I removed
the "break", and I made readList the condition for the while loop.
With these changes, a 0.1 second timeout is about the same as no
timeout, but at 0.01 second I do see a little slowdown. Still
no loss of data.
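
[Editorial sketch] The loop shape described above (no "continue", no "break", readList as the while condition) might look roughly like the following. This is not Donn's actual code, which was not posted. Assumptions: Python 3 with subprocess.Popen standing in for os.popen3 (which no longer exists in Python 3); run_command, stdin_bytes, and read_size are hypothetical names; select() is called with no timeout, so the loop ends only when both pipes report EOF.

```python
import os
import select
import subprocess


def run_command(command, stdin_bytes=b"", read_size=16384):
    """Run command; return (stdout_bytes, stderr_bytes, returncode)."""
    proc = subprocess.Popen(
        command,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        bufsize=0,  # unbuffered file objects, since we use os.read/os.write
    )
    chunks = {proc.stdout: [], proc.stderr: []}
    read_list = [proc.stdout, proc.stderr]
    pending = memoryview(stdin_bytes)
    write_list = [proc.stdin] if stdin_bytes else []
    if not stdin_bytes:
        proc.stdin.close()  # nothing to send: let the child see EOF now

    # The loop condition is the only exit: run until every pipe is done.
    while read_list or write_list:
        r, w, _ = select.select(read_list, write_list, [])
        for f in w:
            try:
                # Write at most PIPE_BUF bytes so the write cannot block.
                n = os.write(f.fileno(), pending[:select.PIPE_BUF])
            except BrokenPipeError:
                n = len(pending)  # child closed its stdin; drop the rest
            pending = pending[n:]
            if not pending:
                write_list.remove(f)
                f.close()
        for f in r:
            data = os.read(f.fileno(), read_size)
            if data:
                chunks[f].append(data)
            else:
                read_list.remove(f)  # empty read means EOF on this pipe

    out = b"".join(chunks[proc.stdout])
    err = b"".join(chunks[proc.stderr])
    proc.stdout.close()
    proc.stderr.close()
    return out, err, proc.wait()
```

For example, run_command(["ls", "-R", "."]) would collect the full listing regardless of read_size, because nothing in the loop gives up before EOF.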

I suspect there is indeed something wrong with your logic, but
I'm not going to try to figure it out. If you're sure it's
right, I think you should post again with the actual code for
a program that demonstrates your problem(s). Your goal for the
revised logic should be 1) avoid gratuitous branches in the flow
of control, 2) reduce the number of state variables that you have to
account for, and 3) express your intentions clearly with respect
to the timeouts -- what do you do when it times out, and why?
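
[Editorial sketch] On point 3, it may help to see that a select() timeout and end-of-file are different events. The sketch below uses a plain os.pipe() rather than a child process (an assumption made to keep it self-contained); the variable names are hypothetical. A timeout yields empty lists and says only that the writer was quiet for that interval. EOF shows up as a readable descriptor whose os.read() returns an empty result.

```python
import os
import select

read_fd, write_fd = os.pipe()

# Writer is alive but silent: select() times out and returns empty lists.
r, _, _ = select.select([read_fd], [], [], 0.05)
timed_out_while_writer_open = (r == [])

# The writer produces data after the timeout; it is still delivered.
os.write(write_fd, b"late data")
r, _, _ = select.select([read_fd], [], [], 0.05)
readable_after_write = (read_fd in r)
data = os.read(read_fd, 16384)

# Writer closes its end: the descriptor becomes readable and the
# read returns an empty result, which is the actual EOF signal.
os.close(write_fd)
r, _, _ = select.select([read_fd], [], [], 0.05)
readable_at_eof = (read_fd in r)
eof_marker = os.read(read_fd, 16384)

os.close(read_fd)
```

A loop that stops on the first timeout would have missed "late data" here, so treating a timeout as "finished" is a guess about the child's pace rather than a fact about the pipe.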

Donn Cave
 