Sun Grid Engine / NFS and Python shell execution question

 
 
J.B. Brown
      07-22-2010
Hello everyone, and thank you for taking the time to read this.

For quite some time, I have had a problem using Python's shell
execution facilities in a cluster computing environment (such as
Sun Grid Engine (SGE)).
In particular, I wish to repeatedly execute a number of commands in
sub-shells or pipes within a single function, and each execution
depends on the result of the previous one, so simply writing out a
script file of commands in advance and running it is not an option
for me.

To isolate and exemplify my problem, I have created three files:
(1) one which exemplifies the spirit of the code I wish to execute in Python
(2) one which serves as the SGE execution script file, and actually
calls python to execute the code in (1)
(3) a simple shell script which executes (2) a sufficient number of
times that it fills all processors on my computing cluster and leaves
an additional number of jobs in the queue.

Here is the spirit of the experiment/problem:
generateTest.py:
----------------------------------------------
# Constants
numParallelJobs = 100
testCommand = "continue" #"os.popen( \"clear\" )"
loopSize = "1000"

# First, write file with test script.
pythonScript = open( "testScript.py", "w" )
pythonScript.write(
"""
import os
for i in range( 0, """ + loopSize + """ ):
    for j in range( 0, """ + loopSize + """ ):
        for k in range( 0, """ + loopSize + """ ):
            for l in range( 0, """ + loopSize + """ ):
                """ + testCommand + """
""" )
pythonScript.close()

# Second, write SGE script file to execute the Python script.
sgeScript = open( "testScript.sge", "w" )
sgeScript.write (
"""
#$ -cwd
#$ -N pythonTest
#$ -e /export/home/jbbrown/errorLog
#$ -o /export/home/jbbrown/outputLog
python testScript.py
""" )
sgeScript.close()

# Finally, write script to run SGE script a specified number of times.
import os
launchScript = open( "testScript.sh", "w" )
for i in range( 0, numParallelJobs ):
    launchScript.write( "qsub testScript.sge" + os.linesep )
launchScript.close()

----------------------------------------------

Now, let's assume that I have about 50 processors available across 8
compute nodes, with one NFS-mounted disk.
If I run the code as above, which does nothing but execute Python
"continue" statements, the cluster head node reports no serious
NFS daemon load.

However - if I change the code to use the os.popen() call shown as a
comment above, or use os.system(),
the NFS daemon load on my system skyrockets within seconds of
distributing the jobs to the compute nodes -- even though I'm doing
nothing but executing the clear screen command, which technically
doesn't pipe any output to the location for logging stdout.
Even if I change the SGE script file to redirect standard output and
error to explicitly go to /dev/null, I still have the same problem.

I believe the source of this problem is that os.popen() or os.system()
calls spawn subshells which then reference my shell resource files
(.zshrc, .cshrc, .bashrc, etc.).
But I don't see an alternative to os.popen{234} or os.system().
os.exec*() cannot solve my problem, because it transfers execution to
that program and stops executing the script which called os.exec*().

Without having to rewrite a considerable amount of code (which
performs cross validation by repeatedly executing in a subshell) in
terms of a shell script language filled with a large number of
conditional statements, does anyone know of a way to execute external
programs in the middle of a script without referencing the shell
resource file located on an NFS mounted directory?
I have read through the >help(os) documentation repeatedly, but just
can't find a solution.

Even a small lead or thought would be greatly appreciated.

With thanks from humid Kyoto,
J.B. Brown
 
Neil Hodgson
      07-22-2010
J.B. Brown:

> I believe the source of this problem is that os.popen() or os.system()
> calls spawn subshells which then reference my shell resource files
> (.zshrc, .cshrc, .bashrc, etc.).
> But I don't see an alternative to os.popen{234} or os.system().
> os.exec*() cannot solve my problem, because it transfers execution to
> that program and stops executing the script which called os.exec*().


Call fork then call exec from the new process. Search the web for
"fork exec" to find examples in C.

Neil
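
For readers who want the fork-then-exec suggestion in Python rather than C, here is a minimal sketch using os.fork(), os.execvp() and os.waitpid(). The helper name run_without_shell is made up for illustration and is not from the thread. The sketch assumes a Unix-like system, and it only shows how to start a program without going through a shell; whether that removes the NFS load depends on whether the shell start-up files really are the cause.

```python
import os

def run_without_shell(argv):
    """Run argv (program name followed by its arguments) in a child
    process, without starting a shell, and return its exit status.

    Hypothetical helper for illustration only.
    """
    pid = os.fork()
    if pid == 0:
        # Child: replace this process image with the external program.
        # execvp searches PATH, like the C function of the same name.
        try:
            os.execvp(argv[0], argv)
        except OSError:
            # exec failed (for example, program not found). Leave the
            # child at once without running the parent's cleanup code.
            os._exit(127)
    # Parent: block until the child terminates, then decode its status.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    # Otherwise the child was killed by a signal.
    return -os.WTERMSIG(status)
```

The calling script keeps running after each call, so a loop can inspect the returned status and decide what to execute next, for example status = run_without_shell(["clear"]). The standard subprocess module does much the same thing when it is given an argument list and shell is left at its default of False, e.g. subprocess.call(["clear"]).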
 