how to simulate tar filename substitution across piped subprocess.Popen() calls?

 
 
jkn
 
      11-08-2012
Hi All
I am trying to build up a set of subprocess.Popen calls to
replicate the effect of a horribly long shell command. I'm not clear
how I can do one part of this and wonder if anyone can advise. I'm on
Linux, fairly obviously.

I have a command which (simplified) is a tar -c command piped through
to xargs:

tar -czvf myfile.tgz -c $MYDIR mysubdir/ | xargs -I '{}' sh -c "test -f $MYDIR/'{}'"

(The full command is more complicated than this; I got it from a shell
guru).

IIUC, when called like this, the two occurrences of '{}' in the xargs
command will get replaced with the file being added to the tarfile.

Also IIUC, I will need two calls to subprocess.Popen(), using the
stdin argument of the second to receive the output from the first.
But how can I achieve the substitution of the '{}' construction across
these two calls?

Apologies if I've made any howlers in this description - it's very
likely...

Cheers
J^n






 
Hans Mulder
 
      11-09-2012
On 8/11/12 19:05:11, jkn wrote:
> Hi All
> I am trying to build up a set of subprocess.Popen calls to
> replicate the effect of a horribly long shell command. I'm not clear
> how I can do one part of this and wonder if anyone can advise. I'm on
> Linux, fairly obviously.
>
> I have a command which (simplified) is a tar -c command piped through
> to xargs:
>
> tar -czvf myfile.tgz -c $MYDIR mysubdir/ | xargs -I '{}' sh -c "test -f $MYDIR/'{}'"
>
> (The full command is more complicated than this; I got it from a shell
> guru).
>
> IIUC, when called like this, the two occurrences of '{}' in the xargs
> command will get replaced with the file being added to the tarfile.
>
> Also IIUC, I will need two calls to subprocess.Popen(), using the
> stdin argument of the second to receive the output from the first.
> But how can I achieve the substitution of the '{}' construction across
> these two calls?


That's what 'xargs' will do for you. All you need to do, is invoke
xargs with arguments containing '{}'. I.e., something like:

cmd1 = ['tar', '-czvf', 'myfile.tgz', '-c', mydir, 'mysubdir']
first_process = subprocess.Popen(cmd1, stdout=subprocess.PIPE)

cmd2 = ['xargs', '-I', '{}', 'sh', '-c', "test -f %s/'{}'" % mydir]
second_process = subprocess.Popen(cmd2, stdin=first_process.stdout)
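A self-contained version of the two-process pipeline above might look like the following sketch. It assumes Python 3; the helper name `run_pipeline` is made up for illustration and is not part of the subprocess module. Closing the parent's copy of the first process's stdout follows the pattern the subprocess documentation shows for replacing a shell pipeline.

```python
import subprocess


def run_pipeline(cmd1, cmd2):
    """Run cmd1 | cmd2 without a shell.

    Returns (stdout_bytes_of_cmd2, returncode_of_cmd1, returncode_of_cmd2).
    """
    first = subprocess.Popen(cmd1, stdout=subprocess.PIPE)
    second = subprocess.Popen(cmd2, stdin=first.stdout,
                              stdout=subprocess.PIPE)
    # Close our copy of the pipe so that cmd1 can receive SIGPIPE
    # if cmd2 exits before reading everything.
    first.stdout.close()
    output, _ = second.communicate()
    first.wait()
    return output, first.returncode, second.returncode
```

Here `cmd1` and `cmd2` would be the tar and xargs argument lists shown above; the '{}' stays a literal string in the list and is only interpreted by xargs.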

> Apologies if I've made any howlers in this description - it's very
> likely...


I think the second '-c' argument to tar should have been a '-C'.

I'm not sure I understand what the second command is trying to
achieve. On my system, nothing happens, because tar writes the
names of the files it is adding to stderr, so xargs receives no
input at all. If I send the stderr from tar to the stdin of
xargs, then it still doesn't seem to do anything sensible.
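Which stream `tar -v` lists file names on seems to depend on the tar implementation and on where the archive is written, so one way to avoid relying on either is to merge the child's stderr into the pipe. This is only a sketch: the helper name `start_merged` is hypothetical, `stderr=subprocess.STDOUT` is the standard subprocess option, and Python 3 is assumed.

```python
import subprocess


def start_merged(cmd):
    """Start cmd with stderr folded into stdout, so a reader of the pipe
    sees the verbose listing whichever stream the program wrote it to."""
    return subprocess.Popen(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT)
```

The returned object's `stdout` can then be passed as `stdin=` to the xargs process. The drawback is that genuine error messages from tar end up in the same stream as the file names.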

Perhaps your real xargs command is more complicated and more
sensible.



Hope this helps,

-- HansM

 
jkn
 
      11-12-2012
Hi Hans
thanks a lot for your reply:

> That's what 'xargs' will do for you. All you need to do, is invoke
> xargs with arguments containing '{}'. I.e., something like:
>
> cmd1 = ['tar', '-czvf', 'myfile.tgz', '-c', mydir, 'mysubdir']
> first_process = subprocess.Popen(cmd1, stdout=subprocess.PIPE)
>
> cmd2 = ['xargs', '-I', '{}', 'sh', '-c', "test -f %s/'{}'" % mydir]
> second_process = subprocess.Popen(cmd2, stdin=first_process.stdout)
>


Hmm - that's pretty much what I've been trying. I will have to
experiment a bit more and post the results in a bit more detail.

> > Apologies if I've made any howlers in this description - it's very
> > likely...
>
> I think the second '-c' argument to tar should have been a '-C'.


You are correct, thanks. Serves me right for typing the simplified
version in by hand. I actually use the equivalent "--directory=..." in
the actual code.

> I'm not sure I understand what the second command is trying to
> achieve. On my system, nothing happens, because tar writes the
> names of the files it is adding to stderr, so xargs receives no
> input at all. If I send the stderr from tar to the stdin of
> xargs, then it still doesn't seem to do anything sensible.


That's interesting ... on my system, and all others that I know about,
the file list goes to stdout.

> Perhaps your real xargs command is more complicated and more
> sensible.


Yes, in fact the output from xargs is piped to a third process. But I
realise this doesn't alter the result of your experiment; the xargs
process should filter a subset of the files being fed to it.

I will experiment a bit more and hopefully post some results. Thanks
in the meantime...

Regards
Jon N

 
 
jkn
 
      11-12-2012
slight followup ...

I have made some progress; for now I'm using Popen.communicate() to
read the output from the first subprocess, then writing it into the
second subprocess. This way I at least get to see what is
happening ...
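The intermediate approach described here, reading everything from the first process and then feeding it to the second, can be sketched as below. `communicate()` is a method of the Popen object rather than a function of the subprocess module. The helper name `run_staged` is invented for this example, and Python 3 is assumed.

```python
import subprocess


def run_staged(cmd1, cmd2):
    """Run cmd1, keep its whole output in memory, then feed it to cmd2.

    Returns (output_of_cmd1, output_of_cmd2) as bytes, so the
    intermediate listing can be inspected.
    """
    first = subprocess.Popen(cmd1, stdout=subprocess.PIPE)
    listing, _ = first.communicate()  # waits for cmd1 to finish

    second = subprocess.Popen(cmd2, stdin=subprocess.PIPE,
                              stdout=subprocess.PIPE)
    # communicate() writes the data to stdin, closes it, then reads stdout.
    result, _ = second.communicate(listing)
    return listing, result
```

Because the whole listing is held in memory and the two commands run one after the other, this is mainly useful for debugging what each stage produces.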

The reason 'we' weren't seeing any output from the second call (the
'xargs') is that as mentioned I had simplified this. The actual shell
command was more like (in python-speak):

"xargs -I {} sh -c \"test -f %s/{} && md5sum %s/{}\"" % (mydir, mydir)

ie. I am running md5sum on each tar-file entry which passes the 'is
this a file' test.

My next problem; how to translate the command-string clause

"test -f %s/{} && md5sum %s/{}" # ...

into a parameter to subprocess.Popen(). I think it's the command
chaining '&&' which is tripping me up...

Cheers
J^n



 
 
Hans Mulder
 
      11-12-2012
On 12/11/12 16:36:58, jkn wrote:
> slight followup ...
>
> I have made some progress; for now I'm using Popen.communicate() to
> read the output from the first subprocess, then writing it into the
> second subprocess. This way I at least get to see what is
> happening ...
>
> The reason 'we' weren't seeing any output from the second call (the
> 'xargs') is that as mentioned I had simplified this. The actual shell
> command was more like (in python-speak):
>
> "xargs -I {} sh -c \"test -f %s/{} && md5sum %s/{}\"" % (mydir, mydir)
>
> ie. I am running md5sum on each tar-file entry which passes the 'is
> this a file' test.
>
> My next problem; how to translate the command-string clause
>
> "test -f %s/{} && md5sum %s/{}" # ...
>
> into a parameter to subprocess.Popen(). I think it's the command
> chaining '&&' which is tripping me up...


It is not really necessary to translate the '&&': you can
just write:

"test -f '%s/{}' && md5sum '%s/{}'" % (mydir, mydir)

, and xargs will pass that to the shell, and then the shell
will interpret the '&&' for you: you have shell=False in your
subprocess.Popen call, but the arguments to xargs are -I {}
sh -c "....", and this means that xargs ends up invoking the
shell (after replacing the {} with the name of a file).

Alternatively, you could translate it as:

"if [ -f '%s/{}' ]; then md5sum '%s/{}'; fi" % (mydir, mydir)

; that might make the intent clearer to whoever gets to
maintain your code.
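To see the 'if' (or the '&&') being interpreted by the shell that xargs starts, rather than by Python, a small sketch follows. It assumes Python 3 and that `xargs`, `sh` and `md5sum` are on the PATH; the function name `md5_existing_files` is hypothetical, and the file names are fed from Python instead of from tar to keep the example short.

```python
import subprocess


def md5_existing_files(mydir, names):
    """Feed names to xargs; each one is tested and hashed by 'sh -c'."""
    # Same quoting as suggested above: the shell started by xargs sees
    # the single quotes. Names containing a single quote still break it.
    script = "if [ -f '%s/{}' ]; then md5sum '%s/{}'; fi" % (mydir, mydir)
    cmd = ["xargs", "-I", "{}", "sh", "-c", script]
    # No shell=True here: Python starts xargs directly, xargs starts sh.
    proc = subprocess.Popen(cmd, stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE)
    data = "".join(name + "\n" for name in names).encode()
    out, _ = proc.communicate(data)
    return out.decode()
```

One practical difference between the two spellings: with `test -f ... && md5sum ...` the shell exits with a failure status for every name that is not a regular file, which xargs then reports through its own exit status, whereas the `if ... fi` form exits successfully in that case.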


Hope this helps,

-- HansM
 
 
Rebelo
 
      11-12-2012
On Thursday, 8 November 2012 19:05:12 UTC+1, user jkn wrote:
> Hi All
> I am trying to build up a set of subprocess.Popen calls to
> replicate the effect of a horribly long shell command. I'm not clear
> how I can do one part of this and wonder if anyone can advise. I'm on
> Linux, fairly obviously.
>
> J^n


You should try to do it in pure python, avoiding shell altogether.
The first step would be to actually write what it is you want to do.

To filter the files you want to add to the tar file, have a look at the tarfile module (http://docs.python.org/2/library/tar...module-tarfile),
specifically:
TarFile.add(name, arcname=None, recursive=True, exclude=None, filter=None)
which takes a filter parameter:
"If filter is specified it must be a function that takes a TarInfo object argument and returns the changed TarInfo object. If it instead returns None the TarInfo object will be excluded from the archive."
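As a rough illustration of that suggestion, the sketch below builds the archive with tarfile and uses the filter callback to compute an MD5 digest for every regular file, which is what the shell pipeline in this thread does with `test -f` and `md5sum`. It assumes Python 3; the function name `archive_with_md5` and its parameters are invented for the example.

```python
import hashlib
import os
import tarfile


def archive_with_md5(mydir, subdir, archive_path):
    """Create a gzipped tar of mydir/subdir and return
    {member name: md5 hex digest} for every regular file added."""
    digests = {}

    def record(tarinfo):
        # Called by TarFile.add() for each member. Returning the TarInfo
        # keeps the member; returning None would exclude it.
        if tarinfo.isfile():
            # tarinfo.name is relative to mydir because of arcname below.
            with open(os.path.join(mydir, tarinfo.name), "rb") as fh:
                digests[tarinfo.name] = hashlib.md5(fh.read()).hexdigest()
        return tarinfo

    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(os.path.join(mydir, subdir), arcname=subdir, filter=record)
    return digests
```

Since no shell is involved, spaces or quotes in file names need no special handling. The sketch reads each file fully into memory, so very large files would call for hashing in chunks.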

 
 
jkn
 
      11-12-2012
Hi Hans

On Nov 12, 4:36 pm, Hans Mulder <han...@xs4all.nl> wrote:
> It is not really necessary to translate the '&&': you can
> just write:
>
>     "test -f '%s/{}' && md5sum '%s/{}'" % (mydir, mydir)
>
> , and xargs will pass that to the shell, and then the shell
> will interpret the '&&' for you: you have shell=False in your
> subprocess.Popen call, but the arguments to xargs are -I {}
> sh -c "....", and this means that xargs ends up invoking the
> shell (after replacing the {} with the name of a file).
>
> Alternatively, you could translate it as:
>
>     "if [ -f '%s/{}' ]; then md5sum '%s/{}'; fi" % (mydir, mydir)
>
> ; that might make the intent clearer to whoever gets to
> maintain your code.


Yes to both points; turns out that my problem was in building up the
command sequence to subprocess.Popen() - when to use, and not use,
quotes etc. It has ended up as (spelled out in longhand...)


xargsproc = ['xargs']

xargsproc.append('-I')
xargsproc.append("{}")

xargsproc.append('sh')
xargsproc.append('-c')

xargsproc.append("test -f %s/{} && md5sum %s/{}" % (mydir,
mydir))


As usual, breaking it all down for the purposes of clarification has
helped a lot, as has your input. Thanks a lot.

Cheers
Jon N
 
 
jkn
 
      11-12-2012
On Nov 12, 4:58 pm, Rebelo <puntabl...@gmail.com> wrote:
> You should try to do it in pure python, avoiding shell altogether.
> The first step would be to actually write what it is you want to do.


Hi Rebelo
FWIW I intend to do exactly this - but I wanted to duplicate the
existing shell action beforehand, so that I could get rid of the shell
command.

After I've tidied things up, that will be my next step.

Cheers
Jon N



 
 
Hans Mulder
 
      11-12-2012
On 12/11/12 18:22:44, jkn wrote:
> Yes to both points; turns out that my problem was in building up the
> command sequence to subprocess.Popen() - when to use, and not use,
> quotes etc. It has ended up as (spelled out in longhand...)
>
>
> xargsproc = ['xargs']
>
> xargsproc.append('-I')
> xargsproc.append("{}")
>
> xargsproc.append('sh')
> xargsproc.append('-c')
>
> xargsproc.append("test -f %s/{} && md5sum %s/{}" % (mydir,
> mydir))


This will break if there are spaces in the file name, or other
characters meaningful to the shell. If you change it to

xargsproc.append("test -f '%s/{}' && md5sum '%s/{}'"
% (mydir, mydir))

, then it will only break if there are single quotes in the file name.
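A variant that is not from this thread, offered only as a sketch: instead of pasting the path into the shell script text, pass it to `sh -c` as a positional parameter, so the shell never parses the file name as code. It assumes Python 3 and an xargs that substitutes '{}' inside its arguments (as `-I` does); the function name `md5_existing_files_safe` is hypothetical.

```python
import subprocess


def md5_existing_files_safe(mydir, names):
    """Like the xargs/sh command above, but the path reaches the shell
    as the positional parameter $1 instead of being part of the script."""
    script = 'if [ -f "$1" ]; then md5sum "$1"; fi'
    # The "sh" after the script fills $0; the next argument becomes $1.
    cmd = ["xargs", "-I", "{}", "sh", "-c", script, "sh", mydir + "/{}"]
    proc = subprocess.Popen(cmd, stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE)
    data = "".join(name + "\n" for name in names).encode()
    out, _ = proc.communicate(data)
    return out.decode()
```

This takes the shell's quoting rules out of the picture, but xargs itself still gives quotes and backslashes in its input a special meaning, so names containing those characters would need further care (or the pure Python route).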

As I understand, your plan is to rewrite this bit in pure Python, to
get rid of any and all such problems.

> As usual, breaking it all down for the purposes of clarification has
> helped a lot, as has your input. Thanks a lot.


You're welcome.

-- HansM


 
 
jkn
 
      11-12-2012
Hi Hans

[...]
>
> > xargsproc.append("test -f %s/{} && md5sum %s/{}" % (mydir,
> > mydir))
>
> This will break if there are spaces in the file name, or other
> characters meaningful to the shell. If you change it to
>
> xargsproc.append("test -f '%s/{}' && md5sum '%s/{}'"
> % (mydir, mydir))
>
> , then it will only break if there are single quotes in the file name.


Fair point. As it happens, I know that there are no 'unhelpful'
characters in the filenames ... but it's still worth doing.

>
> As I understand, your plan is to rewrite this bit in pure Python, to
> get rid of any and all such problems.


Yep - as mentioned in another reply I wanted first to have something
which duplicated the current action (which has taken longer than I
expected), and then rework in a more pythonic way.

Still, I've learned some things about the subprocess module, and also
about the shell, so it's been far from wasted time.

Regards
Jon N
 