Velocity Reviews - Computer Hardware Reviews

popen2 with large input

 
 
cherico
 
      01-29-2004
from popen2 import popen2

r, w = popen2 ( 'tr "[A-Z]" "[a-z]"' )
w.write ( t ) # t is a text file of around 30k bytes
w.close ()
text = r.readlines ()
print text
r.close ()

This simple script halted on

w.write ( t )

Does anyone know what the problem is?
 
Eric Brunel
 
      01-29-2004
cherico wrote:
> from popen2 import popen2
>
> r, w = popen2 ( 'tr "[A-Z]" "[a-z]"' )
> w.write ( t ) # t is a text file of around 30k bytes
> w.close ()
> text = r.readlines ()
> print text
> r.close ()
>
> This simple script halted on
>
> w.write ( t )
>
> Does anyone know what the problem is?


Yep: deadlock... Pipes only buffer a limited amount of data: a write blocks once
the pipe is full and the process at the other end is not reading, and a read
blocks while the pipe is empty. If you try the command "tr '[A-Z]' '[a-z]'"
interactively, you'll see that every time tr receives a line, it *immediately*
outputs the converted line. So if you write a large file to the pipe, tr will
start writing its output while you are still writing, and will get stuck as soon
as its output pipe is full, since your program is not reading from it. It then
stops reading its input, so your program gets stuck in turn because it can't
write to the pipe any more. And they'll wait for each other until the end of
time...

If you really want to use the "tr" command for this stuff, you'd better send
your text line by line and read the result immediately, like in:

text = ''
for line in t.splitlines(1):  # t is the input text; 1 keeps the line endings
    w.write(line)
    w.flush() # Mandatory because of output buffering - see below
    text += r.readline()
w.close()
r.close()

It *may* work better, but you cannot be sure: in fact, you just can't know
exactly when tr will actually output the converted text. Even worse: since
output is usually buffered, you'll only see the output from tr when its standard
output is flushed, and you can't know when that will be...

(BTW, the script above does not work on my Linux box: the first r.readline()
never returns...)
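
One way to stop caring about when tr produces output is to do the writing and the reading at the same time. The following is only a sketch in modern Python 3 with the subprocess module (popen2 no longer exists there); `filter_through` is a hypothetical helper name, not something from this thread, and `data` is assumed to be bytes:

```python
import subprocess
import threading

def filter_through(command, data):
    """Send `data` (bytes) to `command` and return what it writes to stdout."""
    proc = subprocess.Popen(command, stdin=subprocess.PIPE, stdout=subprocess.PIPE)

    def feed():
        # Runs in a helper thread so the main thread is free to read.
        try:
            proc.stdin.write(data)
        except BrokenPipeError:
            pass  # the child exited before reading everything
        finally:
            try:
                proc.stdin.close()  # the child sees EOF on its stdin
            except BrokenPipeError:
                pass

    writer = threading.Thread(target=feed)
    writer.start()
    output = proc.stdout.read()  # drains the child's output while feed() writes
    writer.join()
    proc.stdout.close()
    proc.wait()
    return output
```

Something like `filter_through(["tr", "A-Z", "a-z"], t)` should then return the lowercased bytes without hanging. For this simple case `subprocess.Popen.communicate(input=data)` does the same job in one call.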

So the conclusion is: don't use pipes unless you're really forced to. They're
hell to use, since you never know how to synchronize them.

BTW, if the problem you posted is your real problem, why on earth don't you do:
text = t.lower()
???

HTH
--
- Eric Brunel <eric dot brunel at pragmadev dot com> -
PragmaDev : Real Time Software Development Tools - http://www.pragmadev.com

 
Jeff Epler
 
      01-29-2004
The connection to the child process created by the popen family has
some inherent maximum size for data "in flight". I'm not sure how to
find out what that value is, but it might be anywhere from a few bytes
to a few K.
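
One rough way to measure that limit on a given machine is to fill a pipe until a non-blocking write refuses more data. This is a sketch in modern Python 3, assuming a Unix-like system; `pipe_capacity` is a hypothetical helper name used only for illustration:

```python
import os

def pipe_capacity(chunk_size=1024):
    """Count how many bytes fit into an OS pipe before a write would block."""
    read_fd, write_fd = os.pipe()
    os.set_blocking(write_fd, False)  # a full pipe raises instead of blocking
    chunk = b"x" * chunk_size
    total = 0
    try:
        while True:
            try:
                total += os.write(write_fd, chunk)
            except BlockingIOError:
                break  # the pipe is full: nothing is reading from read_fd
    finally:
        os.close(read_fd)
        os.close(write_fd)
    return total

if __name__ == "__main__":
    print("pipe holds about", pipe_capacity(), "bytes before a writer blocks")
```

The number depends on the operating system and kernel version, so treat the result as a property of the machine it runs on rather than a constant.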

So tr starts to write its output as it gets input, but you won't read
its output before you've written all of your input. If the size of tr's
output is bigger than the size of the buffer for tr's unread output,
you'll deadlock.

As an aside, the particular problem you pose can be solved with Python's
str.translate method. If the actual goal is to "work like tr", then use
that instead and forget about popen.
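
A minimal sketch of the translate approach, written for modern Python 3 (where the table is built with `str.maketrans`); `tr_lower` and `UPPER_TO_LOWER` are names made up for this example:

```python
import string

# Translation table mapping A-Z to a-z, the same mapping tr is asked for above.
UPPER_TO_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)

def tr_lower(text):
    """Lowercase only the ASCII letters of `text`; other characters pass through."""
    return text.translate(UPPER_TO_LOWER)

if __name__ == "__main__":
    print(tr_lower("Hello, WORLD"))  # prints "hello, world"
```

Unlike `lower()`, this touches only the characters listed in the table, which is closer to what tr does with an explicit A-Z range.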

Anyway, to solve the popen2 problem, you'll need to write something like this:
[untested, and as you can see there's lots of pseudocode]
def getoutput( command, input ):
    r, w = popen2(command)
    rr = [r]; ww = [w]
    output = []
    set r and w nonblocking
    while 1:
        _r, _w, _ = select.select(rr, ww, [], 0)

        if _w:
            write some stuff from input to w
            if nothing left:
                w.close(); ww = []
        if _r:
            read some stuff into output
            if nothing to read:
                handle the fact that r was closed
                if w was closed: break
                else: probably an error condition
    return "".join(output)
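
The pseudocode above could be filled in roughly as follows. This is a sketch only, in modern Python 3 with subprocess in place of popen2, assuming a Unix-like system and bytes input; the `chunk_size` parameter and the error handling are my own choices, not part of the original outline:

```python
import os
import select
import subprocess

def getoutput(command, data, chunk_size=4096):
    """Send `data` (bytes) to `command` and return everything it prints."""
    proc = subprocess.Popen(command, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    w = proc.stdin.fileno()
    r = proc.stdout.fileno()
    os.set_blocking(w, False)       # a full pipe raises instead of stalling us
    readers = [r]
    writers = [w] if data else []
    if not writers:
        proc.stdin.close()          # nothing to send: signal EOF right away
    output = []
    offset = 0
    while readers:
        # Block until the child can take more input or has produced output.
        ready_r, ready_w, _ = select.select(readers, writers, [])
        if ready_w:
            try:
                offset += os.write(w, data[offset:offset + chunk_size])
            except BlockingIOError:
                pass                # pipe filled up meanwhile; retry later
            except BrokenPipeError:
                offset = len(data)  # child stopped reading its input
            if offset >= len(data):
                proc.stdin.close()  # child sees EOF on its stdin
                writers = []
        if ready_r:
            chunk = os.read(r, chunk_size)
            if chunk:
                output.append(chunk)
            else:
                readers = []        # EOF: child closed its stdout
    if writers:
        proc.stdin.close()          # child finished before taking all input
    proc.stdout.close()
    proc.wait()
    return b"".join(output)
```

With this, `getoutput(["tr", "A-Z", "a-z"], t)` should return the converted bytes however large `t` is, because the loop never writes more than the child is ready to accept and always drains what the child has produced. Note that it blocks in `select` rather than polling with a zero timeout, which avoids spinning the CPU.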

You could also write 'input' into a temporary file and use
commands.getoutput() or os.popen(.., "r").
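
The temporary-file route might look like the sketch below. It uses modern Python 3, where the commands module is gone, so `subprocess.run` stands in for `commands.getoutput()`; `getoutput_via_tempfile` is a hypothetical name and `data` is assumed to be bytes:

```python
import subprocess
import tempfile

def getoutput_via_tempfile(command, data):
    """Write `data` to a temporary file and let `command` read it as stdin."""
    with tempfile.TemporaryFile() as tmp:
        tmp.write(data)
        tmp.seek(0)  # flush and rewind so the child starts at the beginning
        result = subprocess.run(command, stdin=tmp,
                                stdout=subprocess.PIPE, check=True)
    return result.stdout
```

Because the child reads from a file instead of a pipe fed by the parent, the parent only ever has to read, so the two processes cannot end up waiting on each other.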

Jeff

 