Velocity Reviews - Computer Hardware Reviews


Re: Newbie help for using multiprocessing and subprocess packages for creating child processes

 
 
Matt
      06-16-2009
Try replacing:
cmd = [ "ls /path/to/file/"+staname+"_info.pf" ]
with:
cmd = [ "ls", "/path/to/file/"+staname+"_info.pf" ]

Basically, the first is the conceptual equivalent of executing the
following in BASH:
'ls /path/to/file/FOO_info.pf'
(the whole thing quoted as a single word). The second is this:
ls /path/to/file/FOO_info.pf

The first searches for a command in your PATH named "ls /path...". The
second searches for a command named ls and gives it the argument
/path...

Also, I think this is cleaner (but it's up to personal preference):
cmd = [ "ls", "/path/to/file/%s_info.pf" % staname]
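To make the difference concrete, here is a minimal, self-contained sketch (Python 3 syntax; the temporary file stands in for the poster's real _info.pf files):

```python
import os
import subprocess
import tempfile

# Create a real file to list, so the example runs anywhere.
fd, path = tempfile.mkstemp(suffix="_info.pf")
os.close(fd)

# One-element list: the whole string is taken as the program *name*,
# so exec() fails with "No such file or directory" (errno 2).
try:
    subprocess.call(["ls " + path])
except OSError as exc:
    print("single-string form fails with errno:", exc.errno)

# Two-element list: program and argument are separate, as intended.
rc = subprocess.call(["ls", path])
print("list form return code:", rc)

os.remove(path)
```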

________________________
~Matthew Strax-Haber
Northeastern University, CCIS & CBA
Co-op, NASA Langley Research Center
Student Government Association, Special Interest Senator
Resident Student Association, SGA Rep & General Councilor
Chess Club, Treasurer
E-mail: strax-haber.m=AT=neu.edu

On Tue, Jun 16, 2009 at 3:13 PM, Rob Newman <(E-Mail Removed)> wrote:
> Hi All,
>
> I am new to Python, and have a very specific task to accomplish. I have a
> command line shell script that takes two arguments:
>
> create_graphs.sh -v --sta=STANAME
>
> where STANAME is a string 4 characters long.
>
> create_graphs creates a series of graphs using Matlab (among other 3rd party
> packages).
>
> Right now I can run this happily by hand, but I have to manually execute the
> command for each STANAME. What I want is to have a Python script that I pass
> a list of STANAMEs to, and it acts like a daemon and spawns as many child
> processes as there are processors on my server (64), until it goes through
> all the STANAMES (about 200).
>
> I posted a message on Stack Overflow (ref:
> http://stackoverflow.com/questions/8...m-use-multipro) and
> was recommended to use the multiprocessing and subprocess packages. In the
> Stack Overflow answers, it was suggested that I use the process pool class
> in multiprocessing. However, the server I have to use is a Sun Sparc (T5220,
> Sun OS 5.10) and there is a known issue with sem_open() (ref:
> http://bugs.python.org/issue3770), so it appears I cannot use the process
> pool class.
>
> So, below is my script (controller.py) that I have attempted to use as a
> test, that just calls the 'ls' command on a file I know exists rather than
> firing off my shell script (which takes ~ 10 mins to run per STANAME):
>
> #!/path/to/python
>
> import sys
> import os
> import json
> import multiprocessing
> import subprocess
>
> def work(verbose,staname):
>     print 'function:',staname
>     print 'parent process:', os.getppid()
>     print 'process id:', os.getpid()
>     print "ls /path/to/file/"+staname+"_info.pf"
>     # cmd will eventually get replaced with the shell script with the
>     # verbose and staname options
>     cmd = [ "ls /path/to/file/"+staname+"_info.pf" ]
>     return subprocess.call(cmd, shell=False)
>
> if __name__ == '__main__':
>
>     report_sta_list = ['B10A','B11A','BNLO']
>
>     # Print out the complete station list for testing
>     print report_sta_list
>
>     # Get the number of processors available
>     num_processes = multiprocessing.cpu_count()
>
>     print 'Number of processes: %s' % (num_processes)
>
>     print 'Now trying to assign all the processors'
>
>     threads = []
>
>     len_stas = len(report_sta_list)
>
>     print "+++ Number of stations to process: %s" % (len_stas)
>
>     # run until all the threads are done, and there is no data left
>     while len(threads) < len(report_sta_list):
>
>         # if we aren't using all the processors AND there is still data
>         # left to compute, then spawn another thread
>
>         print "+++ Starting to set off all child processes"
>
>         if( len(threads) < num_processes ):
>
>             this_sta = report_sta_list.pop()
>
>             print "+++ Station is %s" % (this_sta)
>
>             p = multiprocessing.Process(target=work,args=['v',this_sta])
>
>             p.start()
>
>             print p, p.is_alive()
>
>             threads.append(p)
>
>         else:
>
>             for thread in threads:
>
>                 if not thread.is_alive():
>
>                     threads.remove(thread)
>
> However, I seem to be running into a whole series of errors:
>
> myhost{rt}62% controller.py
> ['B10A', 'B11A', 'BNLO']
> Number of processes: 64
> Now trying to assign all the processors
> +++ Number of stations to process: 3
> +++ Starting to set off all child processes
> +++ Station is BNLO
> <Process(Process-1, started)> True
> +++ Starting to set off all child processes
> +++ Station is B11A
> function: BNLO
> parent process: 22341
> process id: 22354
> ls /path/to/file/BNLO_info.pf
> <Process(Process-2, started)> True
> function: B11A
> parent process: 22341
> process id: 22355
> ls /path/to/file/B11A_info.pf
> Process Process-1:
> Traceback (most recent call last):
>   File "/opt/csw/lib/python/multiprocessing/process.py", line 231, in _bootstrap
>     self.run()
>   File "/opt/csw/lib/python/multiprocessing/process.py", line 88, in run
>     self._target(*self._args, **self._kwargs)
>   File "controller.py", line 104, in work
>     return subprocess.call(cmd, shell=False)
>   File "/opt/csw/lib/python/subprocess.py", line 444, in call
>     return Popen(*popenargs, **kwargs).wait()
>   File "/opt/csw/lib/python/subprocess.py", line 595, in __init__
>     errread, errwrite)
>   File "/opt/csw/lib/python/subprocess.py", line 1092, in _execute_child
>     raise child_exception
> OSError: [Errno 2] No such file or directory
> Process Process-2:
> Traceback (most recent call last):
>   File "/opt/csw/lib/python/multiprocessing/process.py", line 231, in _bootstrap
>     self.run()
>   File "/opt/csw/lib/python/multiprocessing/process.py", line 88, in run
>     self._target(*self._args, **self._kwargs)
>   File "controller.py", line 104, in work
>     return subprocess.call(cmd, shell=False)
>   File "/opt/csw/lib/python/subprocess.py", line 444, in call
>     return Popen(*popenargs, **kwargs).wait()
>   File "/opt/csw/lib/python/subprocess.py", line 595, in __init__
>     errread, errwrite)
>   File "/opt/csw/lib/python/subprocess.py", line 1092, in _execute_child
>     raise child_exception
> OSError: [Errno 2] No such file or directory
>
> The files are there:
>
> mhost{me}11% ls -la /path/to/files/BNLO_info.pf
> -rw-rw-r--   1 me   group   391 May 19 22:40 /path/to/files/BNLO_info.pf
> myhost{me}12% ls -la /path/to/file/B11A_info.pf
> -rw-rw-r--   1 me   group   391 May 19 22:27 /path/to/files/B11A_info.pf
>
> I might be doing this completely wrong, but I thought this would be the way
> to list the files dynamically. Admittedly this is just a stepping stone to
> running the actual shell script I want to run. Can anyone point me in the
> right direction or offer any advice for using these packages?
>
> Thanks in advance for any help or insight.
> - Rob
> --
> http://mail.python.org/mailman/listinfo/python-list
>

 
Piet van Oostrum
      06-16-2009
>>>>> Matt <(E-Mail Removed)> (M) wrote:

>M> Try replacing:
>M> cmd = [ "ls /path/to/file/"+staname+"_info.pf" ]
>M> with:
>M> cmd = [ "ls", "/path/to/file/"+staname+"_info.pf" ]


In addition I would like to remark that -- if the only thing you want to
do is to start up a new command with subprocess.Popen -- the use of the
multiprocessing package is overkill. You could use threads as well.
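A minimal sketch of that thread-based approach (the station names and the worker command are stand-ins for the real create_graphs.sh invocation; the semaphore caps how many children run at once):

```python
import subprocess
import sys
import threading

stanames = ['B10A', 'B11A', 'BNLO']   # stand-in station list
max_workers = 2                       # cpu_count() in the real script

limit = threading.Semaphore(max_workers)

def work(staname):
    # Each thread just blocks in subprocess.call; the GIL is released
    # while waiting on the child, so plain threads suffice here.
    with limit:
        subprocess.call([sys.executable, "-c",
                         "print('processing %s')" % staname])

threads = [threading.Thread(target=work, args=(s,)) for s in stanames]
for t in threads:
    t.start()
for t in threads:
    t.join()
```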

Moreover, if you don't expect any output from these processes and don't
supply input to them through pipes there isn't even a need for these
threads. You could just use os.wait() to wait for a child to finish and
then start a new process if necessary.
--
Piet van Oostrum <(E-Mail Removed)>
URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
Private email: (E-Mail Removed)
 
Mike Kazantsev
 
      06-17-2009
On Tue, 16 Jun 2009 23:20:05 +0200
Piet van Oostrum <(E-Mail Removed)> wrote:

> >>>>> Matt <(E-Mail Removed)> (M) wrote:

>
> >M> Try replacing:
> >M> cmd = [ "ls /path/to/file/"+staname+"_info.pf" ]
> >M> with:
> >M> cmd = [ "ls", "/path/to/file/"+staname+"_info.pf" ]

>
> In addition I would like to remark that -- if the only thing you want
> to do is to start up a new command with subprocess.Popen -- the use
> of the multiprocessing package is overkill. You could use threads as
> well.
>
> Moreover, if you don't expect any output from these processes and
> don't supply input to them through pipes there isn't even a need for
> these threads. You could just use os.wait() to wait for a child to
> finish and then start a new process if necessary.


And even if there is a need to read/write data from/to the pipes more
than once (aka communicate), using threads or more Python subprocesses
seems like hammering a nail with a sledgehammer - just _read_ or
_write_ to the pipes asynchronously.
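On POSIX, that asynchronous approach can be sketched with select on the child's stdout (the child here is a stand-in for the real long-running command):

```python
import os
import select
import subprocess
import sys

# Stand-in child that writes two lines to stdout.
p = subprocess.Popen(
    [sys.executable, "-c", "print('line one'); print('line two')"],
    stdout=subprocess.PIPE,
)

chunks = []
while True:
    # Block (up to 1s) until the pipe has data, instead of parking a
    # dedicated reader thread in a blocking read().
    ready, _, _ = select.select([p.stdout], [], [], 1.0)
    if p.stdout in ready:
        data = os.read(p.stdout.fileno(), 4096)
        if not data:        # EOF: child closed its end of the pipe
            break
        chunks.append(data)

p.wait()
print(b"".join(chunks).decode())
```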

--
Mike Kazantsev // fraggod.net

 