postprocessing in os.walk

 
 
kj
 
      10-10-2009



Perl's directory tree traversal facility is provided by the function
find of the File::Find module. This function accepts an optional
callback, called postprocess, that gets invoked "just before leaving
the currently processed directory." The documentation goes on to
say "This hook is handy for summarizing a directory, such as
calculating its disk usage", which is exactly what I use it for in
a maintenance script.

This maintenance script is getting long in the tooth, and I've been
meaning to add a few enhancements to it for a while, so I thought
that in the process I'd port it to Python, using the os.walk
function, but I see that os.walk does not have anything like this
File::Find::find's postprocess hook. Is there a good way to simulate
it (without having to roll my own File::Find::find in Python)?

TIA!

kynn
 
jordilin
 
      10-11-2009
Well, you could use the alternative os.path.walk instead. You can pass
a callback as a parameter, which will be invoked every time you
bump into a new directory. The signature is
os.path.walk(path, visit, arg). Take a look at the Python library
documentation.
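
For anyone reading this on Python 3: os.path.walk was removed in Python 3.0, so the call above only works on Python 2. A rough sketch of the same callback style built on os.walk might look like the following. The names path_walk and demo are made up for illustration, and the sketch does not reproduce every detail of the old function. Note that the callback fires when a directory is entered (parents before children), so on its own it would not give the "just before leaving" order the question asks about.

```python
import os
import tempfile

def path_walk(top, visit, arg):
    """Hypothetical stand-in for Python 2's os.path.walk(path, visit, arg)."""
    for dirname, dirs, files in os.walk(top):
        # The Python 2 function passed one list of names
        # (subdirectories and files together) to the callback.
        visit(arg, dirname, dirs + files)

def demo():
    seen = []
    with tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, "sub"))
        open(os.path.join(root, "sub", "a.txt"), "w").close()
        # Record each visited directory relative to the root.
        path_walk(root,
                  lambda arg, dirname, names: arg.append(os.path.relpath(dirname, root)),
                  seen)
    return seen  # the parent "." is recorded before "sub"
```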




 
Dave Angel
 
      10-12-2009
Why would you need a special hook when the os.walk() generator yields
exactly once per directory? So whatever work you do on the list of
files you get, you can then put the summary logic immediately after.

Or if you really feel you need a special hook, then write a wrapper for
os.walk(), which takes a hook function as a parameter, and after
yielding each file in a directory, calls the hook. Looks like about 5
lines.

DaveA
 
 
kj
 
      10-13-2009
Dave Angel writes:

>Why would you need a special hook when the os.walk() generator yields
>exactly once per directory? So whatever work you do on the list of
>files you get, you can then put the summary logic immediately after.


>Or if you really feel you need a special hook, then write a wrapper for
>os.walk(), which takes a hook function as a parameter, and after
>yielding each file in a directory, calls the hook. Looks like about 5
>lines.


I think you're missing the point. The hook in question has to be
called *immediately after* all the subtrees that are rooted in
subdirectories contained in the current directory have been visited
by os.walk.

I'd love to see your "5 lines" for *that*.

kj
 
 
Peter Otten
 
      10-13-2009
kj wrote:

> I think you're missing the point. The hook in question has to be
> called *immediately after* all the subtrees that are rooted in
> subdirectories contained in the current directory have been visited
> by os.walk.
>
> I'd love to see your "5 lines" for *that*.


import os

def find(root, process):
    # topdown=False yields a directory only after all of its subdirectories
    for pdf in os.walk(root, topdown=False):
        process(*pdf)

def process(path, dirs, files):
    print(path)

find(".", process)

Peter
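
To connect this back to the disk-usage summary in the original question, here is a minimal sketch that reuses the find/process shape from the snippet above. The names disk_usage, totals and demo are made up for illustration. It assumes regular files only: os.path.getsize follows symlinks and raises OSError on a broken one, and directories reached only through symlinks are not walked, so they count as 0.

```python
import os
import tempfile

def find(root, process):
    # Same shape as the snippet above: bottom-up walk, one callback per directory.
    for path, dirs, files in os.walk(root, topdown=False):
        process(path, dirs, files)

def disk_usage(root):
    """Return {directory: bytes used by it and everything below it}."""
    totals = {}

    def process(path, dirs, files):
        size = sum(os.path.getsize(os.path.join(path, f)) for f in files)
        # Bottom-up order means every walked subdirectory already has a total.
        size += sum(totals.get(os.path.join(path, d), 0) for d in dirs)
        totals[path] = size

    find(root, process)
    return totals

def demo():
    with tempfile.TemporaryDirectory() as root:
        sub = os.path.join(root, "sub")
        os.mkdir(sub)
        with open(os.path.join(root, "top.bin"), "wb") as fh:
            fh.write(b"x" * 10)
        with open(os.path.join(sub, "inner.bin"), "wb") as fh:
            fh.write(b"y" * 5)
        totals = disk_usage(root)
        return totals[sub], totals[root]
```

Because the callback for a directory runs after the callbacks for everything beneath it, the lookup in totals plays the role of the Perl postprocess summary.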

 
 
Paul Rubin
 
      10-13-2009
kj writes:
> I think you're missing the point. The hook in question has to be
> called *immediately after* all the subtrees that are rooted in
> subdirectories contained in the current directory have been visited
> by os.walk.
>
> I'd love to see your "5 lines" for *that*.


I'm having trouble understanding the specification. To find the disk
usage (in bytes) of a directory:

import os, stat

def find_disk_usage(dirname):
    return sum(sum(os.stat(dirpath+'/'+filename)[stat.ST_SIZE]
                   for filename in fn_list)
               for dirpath, dirlist, fn_list in os.walk(dirname))

 
 
Dave Angel
 
      10-13-2009
Thanks Peter,

To expand it to five lines, and make it the generator I mentioned,

import os

def find(root, process):
    for pdf in os.walk(root, topdown=False):
        # yield every file in this directory, then call the hook for it
        for file in pdf[2]:
            yield os.path.join(pdf[0], file)
        process(*pdf)

def process(path, dirs, files):
    print("hooked -- " + path)

for fullpath in find("..", process):
    print(fullpath)


This is a generator which yields each file in a directory tree, and
after all the files below a particular directory are processed,
"immediately" calls the hook.

DaveA
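
One detail worth knowing about the generator version: the hook for a directory runs only when the consumer asks for the next item after that directory's last file, so "immediately" is relative to the consumer's loop. If the loop stops early, the hook for the directory in progress never runs. A small sketch of that behavior follows; demo and the events list are made up for illustration, and the indentation of find is the one implied by the description above.

```python
import os
import tempfile

def find(root, process):
    # Hook is called after a directory's files have all been yielded.
    for path, dirs, files in os.walk(root, topdown=False):
        for name in files:
            yield os.path.join(path, name)
        process(path, dirs, files)

def demo():
    events = []
    with tempfile.TemporaryDirectory() as root:
        open(os.path.join(root, "only.txt"), "w").close()
        gen = find(root, lambda path, dirs, files: events.append("hook"))
        next(gen)                      # yields only.txt; generator pauses at the yield
        events.append("consumed file")
        for _ in gen:                  # resuming the generator is what runs the hook
            pass
    return events
```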
 
 
Ethan Furman
 
      10-13-2009
kj wrote:
> Dave Angel writes:
>>


[snippetty snip]

>>Why would you need a special hook when the os.walk() generator yields
>>exactly once per directory? So whatever work you do on the list of
>>files you get, you can then put the summary logic immediately after.

>
>
> I think you're missing the point. The hook in question has to be
> called *immediately after* all the subtrees that are rooted in
> subdirectories contained in the current directory have been visited
> by os.walk.
>
> I'd love to see your "5 lines" for *that*.
>
> kj


So now that you've seen a couple examples, perhaps you noticed the flag
"topdown=False"? With that (un)set, I repeat the question -- why do you
need a hook?

~Ethan~
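
To see what the topdown flag changes, here is a small sketch that lists the directories of a throwaway tree in both orders. The helper names visit_order and demo are made up for illustration. With topdown=False each directory is yielded after all of its subdirectories, which is the point at which a "just before leaving" summary can be computed in the body of a plain for loop.

```python
import os
import tempfile

def visit_order(root, topdown):
    """Directories in the order os.walk yields them, relative to root."""
    return [os.path.relpath(path, root)
            for path, dirs, files in os.walk(root, topdown=topdown)]

def demo():
    with tempfile.TemporaryDirectory() as root:
        # One nested chain: root/a/b
        os.makedirs(os.path.join(root, "a", "b"))
        return visit_order(root, True), visit_order(root, False)
```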
 