
Matt 06-16-2009 07:47 PM

Re: Newbie help for using multiprocessing and subprocess packages for creating child processes
 
Try replacing:
cmd = [ "ls /path/to/file/"+staname+"_info.pf" ]
with:
cmd = [ "ls", "/path/to/file/"+staname+"_info.pf" ]

Basically, the first is the conceptual equivalent of executing the
following in BASH:
‘ls /path/to/file/FOO_info.pf’
The second is this:
‘ls’ ‘/path/to/file/FOO_info.pf’

The first searches your PATH for a command named ‘ls /path...’, which
does not exist, hence the "No such file or directory" error. The
second searches for a command named ‘ls’ and gives it the argument
‘/path...’.

Also, I think this is cleaner (but it’s up to personal preference):
cmd = [ "ls", "/path/to/file/%s_info.pf" % staname]
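A minimal sketch of the difference, assuming a POSIX system and Python 3 syntax rather than the Python 2 used in this thread. The helper name try_call is made up for illustration, and the current Python interpreter stands in for ‘ls’ so that the example does not depend on any file existing:

```python
import subprocess
import sys

def try_call(cmd):
    """Return the exit status, or the OSError raised if the program cannot be found."""
    try:
        return subprocess.call(cmd, shell=False)
    except OSError as exc:
        return exc

program = sys.executable  # stand-in for "ls" (assumption for this sketch)

# One list element: the whole string is treated as the program name,
# so the lookup fails just like 'ls /path/to/file/FOO_info.pf' did.
one_string = try_call([program + " -c pass"])

# Separate list elements: the first is the program, the rest are its arguments.
split_list = try_call([program, "-c", "pass"])

print("one string:", repr(one_string))
print("split list:", split_list)
```

With shell=False, each list element is passed through as exactly one argument, so nothing splits the string on spaces for you.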

________________________
~Matthew Strax-Haber
Northeastern University, CCIS & CBA
Co-op, NASA Langley Research Center
Student Government Association, Special Interest Senator
Resident Student Association, SGA Rep & General Councilor
Chess Club, Treasurer
E-mail: strax-haber.m=AT=neu.edu

On Tue, Jun 16, 2009 at 3:13 PM, Rob Newman<rlnewman@ucsd.edu> wrote:
> Hi All,
>
> I am new to Python, and have a very specific task to accomplish. I have a
> command line shell script that takes two arguments:
>
> create_graphs.sh -v --sta=STANAME
>
> where STANAME is a string 4 characters long.
>
> create_graphs creates a series of graphs using Matlab (among other 3rd party
> packages).
>
> Right now I can run this happily by hand, but I have to manually execute the
> command for each STANAME. What I want is to have a Python script that I pass
> a list of STANAMEs to, and it acts like a daemon and spawns as many child
> processes as there are processors on my server (64), until it goes through
> all the STANAMES (about 200).
>
> I posted a message on Stack Overflow (ref:
> http://stackoverflow.com/questions/8...m-use-multipro) and
> was recommended to use the multiprocessing and subprocess packages. In the
> Stack Overflow answers, it was suggested that I use the process pool class
> in multiprocessing. However, the server I have to use is a Sun Sparc (T5220,
> Sun OS 5.10) and there is a known issue with sem_open() (ref:
> http://bugs.python.org/issue3770), so it appears I cannot use the process
> pool class.
>
> So, below is my script (controller.py) that I have attempted to use as a
> test, that just calls the 'ls' command on a file I know exists rather than
> firing off my shell script (which takes ~ 10 mins to run per STANAME):
>
> #!/path/to/python
>
> import sys
> import os
> import json
> import multiprocessing
> import subprocess
>
> def work(verbose,staname):
>     print 'function:',staname
>     print 'parent process:', os.getppid()
>     print 'process id:', os.getpid()
>     print "ls /path/to/file/"+staname+"_info.pf"
>     # cmd will eventually get replaced with the shell script with the verbose
>     # and staname options
>     cmd = [ "ls /path/to/file/"+staname+"_info.pf" ]
>     return subprocess.call(cmd, shell=False)
>
> if __name__ == '__main__':
>
>     report_sta_list = ['B10A','B11A','BNLO']
>
>     # Print out the complete station list for testing
>     print report_sta_list
>
>     # Get the number of processors available
>     num_processes = multiprocessing.cpu_count()
>
>     print 'Number of processes: %s' % (num_processes)
>
>     print 'Now trying to assign all the processors'
>
>     threads = []
>
>     len_stas = len(report_sta_list)
>
>     print "+++ Number of stations to process: %s" % (len_stas)
>
>     # run until all the threads are done, and there is no data left
>     while len(threads) < len(report_sta_list):
>
>         # if we aren't using all the processors AND there is still data left to
>         # compute, then spawn another thread
>
>         print "+++ Starting to set off all child processes"
>
>         if( len(threads) < num_processes ):
>
>             this_sta = report_sta_list.pop()
>
>             print "+++ Station is %s" % (this_sta)
>
>             p = multiprocessing.Process(target=work,args=['v',this_sta])
>
>             p.start()
>
>             print p, p.is_alive()
>
>             threads.append(p)
>
>         else:
>
>             for thread in threads:
>
>                 if not thread.is_alive():
>
>                     threads.remove(thread)
>
>
> However, I seem to be running into a whole series of errors:
>
> myhost{rt}62% controller.py
> ['B10A', 'B11A', 'BNLO']
> Number of processes: 64
> Now trying to assign all the processors
> +++ Number of stations to process: 3
> +++ Starting to set off all child processes
> +++ Station is BNLO
> <Process(Process-1, started)> True
> +++ Starting to set off all child processes
> +++ Station is B11A
> function: BNLO
> parent process: 22341
> process id: 22354
> ls /path/to/file/BNLO_info.pf
> <Process(Process-2, started)> True
> function: B11A
> parent process: 22341
> process id: 22355
> ls /path/to/file/B11A_info.pf
> Process Process-1:
> Traceback (most recent call last):
>   File "/opt/csw/lib/python/multiprocessing/process.py", line 231, in _bootstrap
>     self.run()
>   File "/opt/csw/lib/python/multiprocessing/process.py", line 88, in run
>     self._target(*self._args, **self._kwargs)
>   File "controller.py", line 104, in work
>     return subprocess.call(cmd, shell=False)
>   File "/opt/csw/lib/python/subprocess.py", line 444, in call
>     return Popen(*popenargs, **kwargs).wait()
>   File "/opt/csw/lib/python/subprocess.py", line 595, in __init__
>     errread, errwrite)
>   File "/opt/csw/lib/python/subprocess.py", line 1092, in _execute_child
>     raise child_exception
> OSError: [Errno 2] No such file or directory
> Process Process-2:
> Traceback (most recent call last):
>   File "/opt/csw/lib/python/multiprocessing/process.py", line 231, in _bootstrap
>     self.run()
>   File "/opt/csw/lib/python/multiprocessing/process.py", line 88, in run
>     self._target(*self._args, **self._kwargs)
>   File "controller.py", line 104, in work
>     return subprocess.call(cmd, shell=False)
>   File "/opt/csw/lib/python/subprocess.py", line 444, in call
>     return Popen(*popenargs, **kwargs).wait()
>   File "/opt/csw/lib/python/subprocess.py", line 595, in __init__
>     errread, errwrite)
>   File "/opt/csw/lib/python/subprocess.py", line 1092, in _execute_child
>     raise child_exception
> OSError: [Errno 2] No such file or directory
>
> The files are there:
>
> mhost{me}11% ls -la /path/to/files/BNLO_info.pf
> -rw-rw-r--   1 me       group        391 May 19 22:40 /path/to/files/BNLO_info.pf
> myhost{me}12% ls -la /path/to/file/B11A_info.pf
> -rw-rw-r--   1 me       group        391 May 19 22:27 /path/to/files/B11A_info.pf
>
> I might be doing this completely wrong, but I thought this would be the way
> to list the files dynamically. Admittedly this is just a stepping stone to
> running the actual shell script I want to run. Can anyone point me in the
> right direction or offer any advice for using these packages?
>
> Thanks in advance for any help or insight.
> - Rob
> --
> http://mail.python.org/mailman/listinfo/python-list
>


Piet van Oostrum 06-16-2009 09:20 PM

Re: Newbie help for using multiprocessing and subprocess packages for creating child processes
 
>>>>> Matt <HellZFury+Python@gmail.com> (M) wrote:

>M> Try replacing:
>M> cmd = [ "ls /path/to/file/"+staname+"_info.pf" ]
>M> with:
>M> cmd = [ "ls", "/path/to/file/"+staname+"_info.pf" ]


In addition I would like to remark that -- if the only thing you want to
do is to start up a new command with subprocess.Popen -- the use of the
multiprocessing package is overkill. You could use threads as well.
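A rough sketch of the thread-based variant, in Python 3 syntax. It uses concurrent.futures.ThreadPoolExecutor, which is a modern convenience and not something named in this thread. The function names run_one and run_all are hypothetical, and the command is a placeholder (the Python interpreter doing nothing) standing in for the real shell script:

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

def run_one(staname):
    """Run one external command for a station and return its exit status."""
    # Placeholder command (assumption): replace with the real script and options.
    cmd = [sys.executable, "-c", "pass", staname]
    return subprocess.call(cmd)

def run_all(stanames, max_workers):
    """Run at most max_workers commands at a time; map each station to its exit status."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(zip(stanames, pool.map(run_one, stanames)))

if __name__ == "__main__":
    print(run_all(["B10A", "B11A", "BNLO"], max_workers=2))
```

Each thread spends its time blocked in subprocess.call, so the threads cost little; the actual work runs in the child processes.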

Moreover, if you don't expect any output from these processes and don't
supply input to them through pipes there isn't even a need for these
threads. You could just use os.wait() to wait for a child to finish and
then start a new process if necessary.
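One way the os.wait() approach might look, as a sketch for POSIX systems in Python 3 syntax. It starts children with os.spawnv instead of subprocess.Popen so that no Popen object is left tracking a child that os.wait() has already reaped; that substitution is a choice made for this sketch. The names run_all and exit_code are hypothetical, and the command is a placeholder:

```python
import os
import sys

def exit_code(status):
    """Decode a raw wait status: exit code, or negative signal number if killed."""
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)

def run_all(stanames, max_children):
    """Keep up to max_children commands running until every station is done."""
    pending = list(stanames)
    running = {}   # pid -> staname
    results = {}   # staname -> decoded exit code
    while pending or running:
        # Top up to the limit while there is still work left.
        while pending and len(running) < max_children:
            staname = pending.pop(0)
            # Placeholder command (assumption): replace with the real script.
            cmd = [sys.executable, "-c", "pass", staname]
            pid = os.spawnv(os.P_NOWAIT, cmd[0], cmd)
            running[pid] = staname
        # Block until any one child finishes, then record its result.
        pid, status = os.wait()
        staname = running.pop(pid, None)
        if staname is not None:
            results[staname] = exit_code(status)
    return results

if __name__ == "__main__":
    print(run_all(["B10A", "B11A", "BNLO"], max_children=2))
```

No threads or worker processes are involved: the parent sleeps in os.wait() and starts the next command as soon as a slot frees up.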
--
Piet van Oostrum <piet@cs.uu.nl>
URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
Private email: piet@vanoostrum.org

Mike Kazantsev 06-17-2009 02:33 AM

Re: Newbie help for using multiprocessing and subprocess packages for creating child processes
 
On Tue, 16 Jun 2009 23:20:05 +0200
Piet van Oostrum <piet@cs.uu.nl> wrote:

> >>>>> Matt <HellZFury+Python@gmail.com> (M) wrote:

>
> >M> Try replacing:
> >M> cmd = [ "ls /path/to/file/"+staname+"_info.pf" ]
> >M> with:
> >M> cmd = [ "ls", "/path/to/file/"+staname+"_info.pf" ]

>
> In addition I would like to remark that -- if the only thing you want
> to do is to start up a new command with subprocess.Popen -- the use
> of the multiprocessing package is overkill. You could use threads as
> well.
>
> Moreover, if you don't expect any output from these processes and
> don't supply input to them through pipes there isn't even a need for
> these threads. You could just use os.wait() to wait for a child to
> finish and then start a new process if necessary.


And even if there is a need to read/write data from/to the pipes more
than once (aka communicate), using threads or any more Python
subprocesses seems like hammering a nail with a sledgehammer - just
_read_ from or _write_ to the pipes asynchronously.
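A sketch of what asynchronous reading could look like on a POSIX system, in Python 3 syntax. It uses the standard selectors module to wait on several children's stdout pipes from a single thread. The function name collect_output and the example commands are made up for illustration:

```python
import os
import selectors
import subprocess
import sys

def collect_output(commands):
    """Run all commands at once and gather each one's stdout without threads.

    commands maps a label to an argument list; returns label -> bytes.
    """
    selector = selectors.DefaultSelector()
    output = {}
    children = []
    for label, cmd in commands.items():
        child = subprocess.Popen(cmd, stdout=subprocess.PIPE)
        output[label] = b""
        selector.register(child.stdout, selectors.EVENT_READ, data=label)
        children.append(child)
    # Loop until every pipe has reached end-of-file and been unregistered.
    while selector.get_map():
        for key, _events in selector.select():
            chunk = os.read(key.fileobj.fileno(), 4096)
            if chunk:
                output[key.data] += chunk
            else:
                selector.unregister(key.fileobj)
                key.fileobj.close()
    for child in children:
        child.wait()  # reap the children once their output is drained
    selector.close()
    return output

if __name__ == "__main__":
    demo = {
        "a": [sys.executable, "-c", "print('hello')"],
        "b": [sys.executable, "-c", "print('world')"],
    }
    print(collect_output(demo))
```

The parent only wakes when some pipe has data or has closed, so one process can service many children.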

--
Mike Kazantsev // fraggod.net


