How do subprocess.Popen("ls | grep foo", shell=True) with shell=False?

 
 
Chris Seberino
      06-10-2010
How do subprocess.Popen("ls | grep foo", shell=True) with shell=False?

Do complex commands with "|" in them mandate shell=True?

cs
 
 
Chris Rebert
      06-10-2010
On Wed, Jun 9, 2010 at 9:15 PM, Chris Seberino wrote:
> How do subprocess.Popen("ls | grep foo", shell=True) with shell=False?


I would think:

from subprocess import Popen, PIPE
ls = Popen("ls", stdout=PIPE)
grep = Popen(["grep", "foo"], stdin=ls.stdout)

Cheers,
Chris
--
http://blog.rebertia.com
 
 
Nobody
      06-10-2010
On Wed, 09 Jun 2010 21:15:48 -0700, Chris Seberino wrote:

> How do subprocess.Popen("ls | grep foo", shell=True) with shell=False?


The same way that the shell does it, e.g.:

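from subprocess import Popen, PIPE
p1 = Popen("ls", stdout=PIPE)
p2 = Popen(["grep", "foo"], stdin=p1.stdout, stdout=PIPE)
p1.stdout.close()             # drop the parent's copy of the pipe's read end
result = p2.communicate()[0]  # read grep's output and wait for grep
p1.wait()                     # reap ls so it does not linger as a zombie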

Notes:

Without the p1.stdout.close(), if the reader (grep) terminates before
consuming all of its input, the writer (ls) won't terminate so long as
Python retains the descriptor corresponding to p1.stdout. In this
situation, the p1.wait() will deadlock.

The communicate() method wait()s for the process to terminate. Other
processes need to be wait()ed on explicitly, otherwise you end up with
"zombies" (labelled "<defunct>" in the output from "ps").
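
The steps above can be wrapped in a small helper. This is only a sketch: the name pipe_two and the shape of its return value are assumptions made for illustration, not part of the subprocess module.

```python
from subprocess import Popen, PIPE


def pipe_two(producer_argv, consumer_argv):
    """Run `producer | consumer` without a shell.

    Returns (consumer_stdout_bytes, producer_returncode, consumer_returncode).
    pipe_two is a hypothetical helper name, not a subprocess API.
    """
    p1 = Popen(producer_argv, stdout=PIPE)
    p2 = Popen(consumer_argv, stdin=p1.stdout, stdout=PIPE)
    # Drop the parent's copy of the pipe's read end, so the consumer is the
    # only reader and the producer sees a broken pipe if the consumer exits.
    p1.stdout.close()
    out = p2.communicate()[0]  # read everything, then wait for the consumer
    rc1 = p1.wait()            # reap the producer so no zombie is left behind
    return out, rc1, p2.returncode
```

With that in place, the original command would be roughly pipe_two(["ls"], ["grep", "foo"]). Note that grep exits with status 1 when nothing matches, so a non-zero consumer return code is not necessarily an error.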

> Does complex commands with "|" in them mandate shell=True?


No.

Also, "ls | grep" may provide a useful tutorial for the subprocess module,
but if you actually need to enumerate files, use e.g. os.listdir/os.walk()
and re.search/fnmatch, or glob. Spawning child processes to perform tasks
which can easily be performed in Python is inefficient (and often creates
unnecessary portability issues).
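
For the "ls | grep foo" case specifically, a pure-Python version might look like the sketch below. The function names are made up for illustration. One difference to keep in mind: os.listdir() also returns names starting with a dot, which plain ls hides by default.

```python
import fnmatch
import os
import re


def names_containing(directory, text):
    """Rough equivalent of `ls directory | grep -F text` (substring match)."""
    return sorted(n for n in os.listdir(directory) if text in n)


def names_matching_regex(directory, pattern):
    """Like the above, but with a regular expression (re.search semantics)."""
    rx = re.compile(pattern)
    return sorted(n for n in os.listdir(directory) if rx.search(n))


def names_matching_glob(directory, pattern):
    """Shell-style wildcard match on the names, e.g. '*.py'."""
    return sorted(fnmatch.filter(os.listdir(directory), pattern))
```

None of these spawn a child process, so there is no pipe to close and nothing to wait() on.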

 
Grant Edwards
      06-10-2010
On 2010-06-10, Chris Seberino wrote:

> How do subprocess.Popen("ls | grep foo", shell=True) with shell=False?


You'll have to build your own pipeline with multiple calls to subprocess.

> Does complex commands with "|" in them mandate shell=True?


Yes.

Hey, I've got a novel idea!

Read the documentation for the subprocess module:

http://docs.python.org/library/subpr...shell-pipeline

--
Grant Edwards               grant.b.edwards at gmail.com
Yow! ... My pants just went on a wild rampage through a
Long Island Bowling Alley!!
 
Chris Seberino
      06-10-2010
On Jun 10, 6:52 am, Nobody <nob...@nowhere.com> wrote:
> Without the p1.stdout.close(), if the reader (grep) terminates before
> consuming all of its input, the writer (ls) won't terminate so long as
> Python retains the descriptor corresponding to p1.stdout. In this
> situation, the p1.wait() will deadlock.
>
> The communicate() method wait()s for the process to terminate. Other
> processes need to be wait()ed on explicitly, otherwise you end up with
> "zombies" (labelled "<defunct>" in the output from "ps").


You are obviously very wise on such things. I'm curious whether this
deadlock issue is a rare event, since grep (hopefully) would rarely
terminate before consuming all its input.

Even if zombies are created, they will eventually get dealt with by the
OS w/o any user intervention needed, right?

I'm just trying to verify that the naive solution of not worrying about
these deadlocks will still be ok and handled adequately by the OS.

cs
 
Lie Ryan
      06-10-2010
On 06/10/10 21:52, Nobody wrote:
> Spawning child processes to perform tasks
> which can easily be performed in Python is inefficient


Not necessarily so. Recently I wrote a script which takes the blink of an
eye when I pipe through cat/grep to prefilter the lines before doing
further complex filtering in Python; however, when I eliminated the
cat/grep subprocess and rewrote it in pure Python, what was done in the
blink of an eye turned into ~8 seconds (not much to fuss about, but it
shows that using subprocess can be faster). I eventually optimized a
couple of things and reduced it to ~1.5 seconds, at which point I stopped,
since going even faster would require reading in larger chunks,
something which I don't really want to do.

The task was to take a directory of ~10 files, each containing thousands
of short lines (~5-10 chars per line on average), and count the number of
lines which match a certain criterion. It's a very typical script job, but
the overhead of reading the files line by line in pure Python can be
a strain (you can read in larger chunks, but that's not the point:
eliminating grep may not come for free).
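
For reference, the "larger chunks" approach mentioned above could be sketched roughly as follows. It assumes the criterion is a plain byte-substring match and that lines end in "\n"; the function name and the default chunk size are arbitrary choices for illustration, and no timing claim is made for it.

```python
def count_matching_lines(path, needle, chunk_size=1 << 20):
    """Count lines in `path` that contain the bytes `needle`.

    Reads the file in binary chunks instead of line by line. A line that
    straddles two chunks is carried over in `tail` until it is complete.
    """
    count = 0
    tail = b""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            lines = (tail + chunk).split(b"\n")
            tail = lines.pop()  # last piece may be an incomplete line
            count += sum(1 for line in lines if needle in line)
    # A final line without a trailing newline is still a line.
    if tail and needle in tail:
        count += 1
    return count
```

Summing count_matching_lines(path, b"foo") over the files in a directory would cover the task described above, for the simple substring case at least.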
 
Nobody
      06-12-2010
On Thu, 10 Jun 2010 08:40:03 -0700, Chris Seberino wrote:

> On Jun 10, 6:52 am, Nobody <nob...@nowhere.com> wrote:
>> Without the p1.stdout.close(), if the reader (grep) terminates before
>> consuming all of its input, the writer (ls) won't terminate so long as
>> Python retains the descriptor corresponding to p1.stdout. In this
>> situation, the p1.wait() will deadlock.
>>
>> The communicate() method wait()s for the process to terminate. Other
>> processes need to be wait()ed on explicitly, otherwise you end up with
>> "zombies" (labelled "<defunct>" in the output from "ps").

>
> You are obviously very wise on such things. I'm curious if this
> deadlock issue is a rare event since I'm grep (hopefully) would rarely
> terminate before consuming all its input.


That depends; it might never start (missing grep, missing shared
library), segfault, terminate due to a signal, etc. Also, the program
might later be modified to use "grep -m <count> ..." which will terminate
after finding <count> matches.
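
To illustrate the early-exit case without depending on grep -m, here is a sketch in which both children are small Python one-liners run via sys.executable (an assumption made for portability; the function and constant names are hypothetical). The reader exits after one line while the writer still has plenty left to write.

```python
import sys
from subprocess import Popen, PIPE, DEVNULL

# Writer: produces far more output than a pipe buffer can hold.
WRITER = "for i in range(10**7): print('foo', i)"
# Reader: consumes one line, echoes it, and exits (much like `grep -m 1`).
READER = "import sys; sys.stdout.write(sys.stdin.readline())"


def first_line_only():
    """Return (first line seen by the reader, writer's return code)."""
    # The writer's stderr is discarded only to hide its broken-pipe complaint.
    p1 = Popen([sys.executable, "-c", WRITER], stdout=PIPE, stderr=DEVNULL)
    p2 = Popen([sys.executable, "-c", READER], stdin=p1.stdout, stdout=PIPE)
    # Without this close(), the parent would keep the read end open, the
    # writer would block on a full pipe, and p1.wait() below would hang.
    p1.stdout.close()
    first = p2.communicate()[0]
    rc1 = p1.wait()
    return first, rc1
```

Because the parent's descriptor is closed, the writer gets a broken pipe once the reader is gone, stops early with a non-zero status, and p1.wait() returns.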

> Even if zombies are created, they will eventually get dealt with my OS
> w/o any user intervention needed right?


They will persist until the parent either wait()s for them (I think that
this will happen if the Popen object gets garbage-collected) or terminates.
For short-lived parent processes, you can forget about them; for long-lived
ones, they need to be dealt with.

> I'm just trying to verify the naive solution of not worrying about
> these deadlock will still be ok and handled adequately by os.


Deadlock is deadlock. If you wait() on the child while it's blocked
waiting for your Python program to consume its output, the wait() will
block forever.
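
A sketch of the safe ordering: drain the pipe first, reap second. The name run_and_capture is made up for illustration; communicate() itself is the documented subprocess method.

```python
from subprocess import Popen, PIPE


def run_and_capture(argv):
    """Run a child, capture its stdout, and return (output, returncode)."""
    p = Popen(argv, stdout=PIPE)
    # Calling p.wait() at this point could block forever: once the child has
    # filled the pipe buffer it stops and waits for us to read, while we
    # would be waiting for it to exit.
    out, _ = p.communicate()  # reads to EOF, then waits for the child
    return out, p.returncode
```

The same reasoning applies to the two-process pipeline: communicate() on the last process, then wait() on the earlier ones.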

 