implementing python's os.walk

 
 
Brad Volz
      12-16-2008
Hello,

I seem to be having some difficulty creating a version of python's
os.walk() for ruby, and I was hoping for some pointers.

As some background, python's os.walk() [1] is a generator function.
It is passed the top of a directory tree and it returns the following
for each subdirectory that it encounters:
. the working directory
. an Array of subdirectories
. an Array of non-directory files

Here is some truncated output for my use case:

>>> import os
>>> repo = '/usr/local/nfsen/profiles-data/live/lax1er1'
>>> for root, dirs, files in os.walk(repo):
...     if len(files) == 288:
...         print root
...
/usr/local/nfsen/profiles-data/live/lax1er1/2008/11/11
..
/usr/local/nfsen/profiles-data/live/lax1er1/2008/12/13

Essentially, when there are exactly 288 files in a subdirectory, I
want to print or otherwise do something with the working directory.

Here is my attempt at simply translating this library function to ruby:

#! /usr/bin/env ruby

require 'pp'

dirs = [ '/usr/local/nfsen/profiles-data/live/lax1er1' ]

def find_dirs(top)
  dirs = []
  nondirs = []
  Dir.entries(top).each do |f|
    next if f =~ /(\.$|\.\.$)/
    full_path = [top, f].join('/')
    if File.directory?(full_path)
      dirs.push(full_path)
    else
      nondirs.push(full_path)
    end
  end

  yield top, dirs, nondirs

  dirs.each do |d|
    if File.directory?(d)
      for o in find_dirs(d) { |a,b,c| puts "#{a} #{b} #{c}"}
        yield o
      end
    end
  end
end

find_dirs(dirs[0]) do |top,dirs,nondirs|
  if nondirs.length == 288
    puts "#{top}"
  end
end

There are some things that I know are wrong or missing, but that is
due to trying to get something to run at all without throwing an
exception.

The part that I think is totally wrong is:

for o in find_dirs(d) { |a,b,c| puts "#{a} #{b} #{c}"}

It's really only in there currently to keep the code from raising
'LocalJumpError: no block given.' Unfortunately, I have no idea what
the correct way to deal with this would be.

The missing part would be including the directory contents in addition
to the working directory and the Array of subdirectories.

So, I guess my main questions would be: What do I need to do to get
this sort of a generator to work? Do I need to wrap this up in a
'class' or is a 'def' sufficient? What should the block look like,
and where should it be in the code?

Thanks for reading,

Brad

[1] http://svn.python.org/view/python/br...2757&view=auto

 
Hugh Sasse
      12-16-2008
On Tue, 16 Dec 2008, Brad Volz wrote:

> Hello,
>
> I seem to be having some difficulty creating a version of python's os.walk()
> for ruby, and I was hoping for some pointers.
>
> As some background, python's os.walk() [1] is a generator function. It is


You can have generators in Ruby
ri Generator
furnishes you with the details...

> passed the top of a directory tree and it returns the following for each
> subdirectory that it encounters:
> . the working directory
> . an Array of subdirectories
> . an Array of non-directory files
>
> Here is some truncated output for my use case:
>
> >>> import os
> >>> repo = '/usr/local/nfsen/profiles-data/live/lax1er1'
> >>> for root, dirs, files in os.walk(repo):
> ...     if len(files) == 288:
> ...         print root
> ...
> /usr/local/nfsen/profiles-data/live/lax1er1/2008/11/11
> ..
> /usr/local/nfsen/profiles-data/live/lax1er1/2008/12/13
>
> Essentially, when there are exactly 288 files in a subdirectory, I want to
> print or otherwise do something with the working directory.


OK.
>
> Here is my attempt at simply translating this library function to ruby:
>
> #! /usr/bin/env ruby
>
> require 'pp'
>
> dirs = [ '/usr/local/nfsen/profiles-data/live/lax1er1' ]
>
> def find_dirs(top)
> dirs = []
> nondirs = []
> Dir.entries(top).each do |f|
> next if f =~ /(\.$|\.\.$)/


or maybe
next if f =~ /^\.\.?$/
or
next if f =~ /^\.{1,2}$/
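
To illustrate why the anchoring matters: the original pattern matches any name that ends in a dot, not only "." and "..". The sketch below compares the two; the entry names are invented for illustration, and it uses \A and \z (whole-string anchors) as a slight variant of the ^ and $ shown above.

```ruby
# Compare the original pattern with an anchored one.
# The entry names below are made up for this example.
LOOSE  = /(\.$|\.\.$)/   # matches anything that ends in a dot
STRICT = /\A\.\.?\z/     # matches only "." and ".."

# Return the names that the given pattern would cause to be skipped.
def skipped(names, pattern)
  names.select { |n| n =~ pattern }
end

names = ['.', '..', '.ssh', 'notes.', 'nfcapd.200812130000']

p skipped(names, LOOSE)   # => [".", "..", "notes."]
p skipped(names, STRICT)  # => [".", ".."]
```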

> full_path = [top, f].join('/')


full_path = File.join(top, f) # separator agnostic

> if File.directory?(full_path)
> dirs.push(full_path)
> else
> nondirs.push(full_path)
> end
> end
>
> yield top, dirs, nondirs


yielding to a proc with arity 3....

>
> dirs.each do |d|
> if File.directory?(d)
> for o in find_dirs(d) { |a,b,c| puts "#{a} #{b} #{c}"}
> yield o


yielding to a proc with arity 1

That may be one problem
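
As far as I can tell, `for o in find_dirs(d) { ... }` iterates over the return value of find_dirs, which is the dirs Array (the result of dirs.each), so each o is a single path String. A small sketch of what a three-parameter block sees in that case; yield_three and yield_one are hypothetical stand-ins for the two yield statements, not part of the original code.

```ruby
# A block with three parameters that is handed a single non-Array value
# binds it to the first parameter and sets the remaining ones to nil.
def yield_three
  yield 'top', ['sub'], ['file']
end

def yield_one
  yield 'top/sub'
end

yield_three { |top, dirs, nondirs| p [top, dirs, nondirs] }
# prints: ["top", ["sub"], ["file"]]

yield_one { |top, dirs, nondirs| p [top, dirs, nondirs] }
# prints: ["top/sub", nil, nil]
```

With nondirs set to nil, a call such as nondirs.length in the caller's block would raise NoMethodError.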

> end
> end
> end
> end
>
> find_dirs(dirs[0]) do |top,dirs,nondirs|
> if nondirs.length == 288
> puts "#{top}"
> end
> end
>
> There are some things that I know are wrong or missing, but that is due to
> trying to get something to run at all without throwing an exception.


ri Find is short enough to quote:

------------------------------------------------------------ Class: Find
The +Find+ module supports the top-down traversal of a set of file
paths.

For example, to total the size of all files under your home
directory, ignoring anything in a "dot" directory (e.g.
$HOME/.ssh):

  require 'find'

  total_size = 0

  Find.find(ENV["HOME"]) do |path|
    if FileTest.directory?(path)
      if File.basename(path)[0] == ?.
        Find.prune       # Don't look any further into this directory.
      else
        next
      end
    else
      total_size += FileTest.size(path)
    end
  end

------------------------------------------------------------------------


Instance methods:
-----------------
find, prune


That will do most of the lifting for you...
[...]
Hugh

 
Brian Candler
      12-16-2008
Brad Volz wrote:
> As some background, python's os.walk() [1] is a generator function.
> It is passed the top of a directory tree and it returns the following
> for each subdirectory that it encounters:
> . the working directory
> . an Array of subdirectories
> . an Array of non-directory files


The normal 'ruby way' to do this would be a method which *yields*
each of these things to a block in turn, rather than returning them.

In many cases you can use it directly like this. If you want to turn it
into a generator you can wrap it using generator.rb; or more cleanly in
ruby 1.9, turn it into an Enumerator, which has this functionality built
in.

class Foo
  def all_dirs
    yield "dir1"
    yield "dir2"
    yield "dir3"
  end
end

foo = Foo.new

# Normal style
foo.all_dirs { |x| p x }

# Generator style (ruby 1.9, uses Fiber)
g = foo.to_enum(:all_dirs)
3.times { p g.next }

# Generator style (ruby 1.8, beware uses callcc)
require 'generator'
require 'enumerator'
g = Generator.new(foo.to_enum(:all_dirs))
3.times { p g.next }
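
A common way to offer both styles from one method in Ruby 1.9 or later is to return an Enumerator when no block is given. The sketch below shows that idiom; each_dir_name and its hard-coded names are placeholders for illustration only.

```ruby
# With a block the method yields directly; without one it returns an
# Enumerator over the same sequence (Ruby 1.9 or later).
# each_dir_name and the names it yields are hypothetical.
def each_dir_name
  return to_enum(:each_dir_name) unless block_given?
  yield 'dir1'
  yield 'dir2'
  yield 'dir3'
end

# Normal (internal) iteration
each_dir_name { |name| puts name }

# Generator-style (external) iteration
e = each_dir_name
puts e.next   # dir1
puts e.next   # dir2
```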
 
Urabe Shyouhei
      12-16-2008
Brad Volz wrote:
> Hello,
>
> I seem to be having some difficulty creating a version of python's
> os.walk() for ruby, and I was hoping for some pointers.


You should really get to know Ruby's standard library "find".
http://www.ruby-doc.org/stdlib/libdo...doc/index.html

Below is another twisted example in Ruby 1.9. It can be used much like python's
generators:

irb(main):060:0> for root, dirs, files in os_walk("/dev/disk") do
irb(main):061:1* puts root
irb(main):062:1> end



require 'find'
require 'pathname'

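def os_walk(dir)
  ret = Fiber.new do
    Pathname(dir).find do |ent|
      next unless ent.directory?
      dirs, files = ent.children.partition {|i| i.directory? }
      Fiber.yield ent, dirs, files
    end
    raise   # traversal finished: the exception ends the loop in #each
  end
  def ret.each
    # resume returns the next [ent, dirs, files]; the rescue stops iteration
    loop { yield resume } rescue self
  end
  ret
end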

 
 
Robert Klemme
      12-16-2008
2008/12/16 Brad Volz:
> I seem to be having some difficulty creating a version of python's os.walk()
> for ruby, and I was hoping for some pointers.
>
> As some background, python's os.walk() [1] is a generator function. It is
> passed the top of a directory tree and it returns the following for each
> subdirectory that it encounters:
> . the working directory
> . an Array of subdirectories
> . an Array of non-directory files
>
> Here is some truncated output for my use case:
>
> >>> import os
> >>> repo = '/usr/local/nfsen/profiles-data/live/lax1er1'
> >>> for root, dirs, files in os.walk(repo):
> ...     if len(files) == 288:
> ...         print root
> ...
> /usr/local/nfsen/profiles-data/live/lax1er1/2008/11/11
> ..
> /usr/local/nfsen/profiles-data/live/lax1er1/2008/12/13
>
> Essentially, when there are exactly 288 files in a subdirectory, I want to
> print or otherwise do something with the working directory.


> The part that I think is totally wrong is:
>
> for o in find_dirs(d) { |a,b,c| puts "#{a} #{b} #{c}"}
>
> It's really only in there currently to keep the code from raising
> 'LocalJumpError: no block given.' Unfortunately, I have no idea what the
> correct way to deal with this would be.
>
> The missing part would be including the directory contents in addition to
> the working directory and the Array of subdirectories.
>
> So, I guess my main questions would be: What do I need to do to get this
> sort of a generator to work? Do I need to wrap this up in a 'class' or is a
> 'def' sufficient? What should the block look like, and where should it be
> in the code?


You have a recursive algorithm here, but you want each call of the
method to invoke the *same block*. This can be achieved by forwarding
the block with the &b notation:

def find_dirs(top, &b)
  ...
  # enter recursion
  find_dirs(d, &b)
end

The way you did it, every invocation yields to its own caller's block.
That is the correct block only for the outermost call; every nested
call yields to the block its parent passed when calling find_dirs.
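
Put together, the forwarding approach might look like the sketch below. The method name walk is made up here; yielding bare entry names rather than full paths mirrors os.walk, and treating symlinked directories as non-directories (to avoid loops) is an assumption of this sketch. The demo builds a throwaway tree so it runs anywhere.

```ruby
require 'tmpdir'
require 'fileutils'

# Yield [top, dirs, nondirs] for top and every directory below it,
# passing the caller's block down through each recursive call.
# walk is a hypothetical name, not an existing library method.
def walk(top, &block)
  dirs = []
  nondirs = []
  Dir.entries(top).sort.each do |name|
    next if name == '.' || name == '..'
    path = File.join(top, name)
    # Assumption: symlinks to directories are listed as non-directories.
    if File.directory?(path) && !File.symlink?(path)
      dirs << name
    else
      nondirs << name
    end
  end

  # Report this directory first, then recurse with the same block.
  yield top, dirs, nondirs
  dirs.each { |d| walk(File.join(top, d), &block) }
end

# Demo on a temporary tree: <root>/a/one.txt and <root>/a/b/
Dir.mktmpdir do |root|
  FileUtils.mkdir_p(File.join(root, 'a', 'b'))
  FileUtils.touch(File.join(root, 'a', 'one.txt'))
  walk(root) do |top, dirs, nondirs|
    puts "#{top}: #{dirs.length} dirs, #{nondirs.length} files"
  end
end
```

For the original use case the block would become `{ |top, dirs, nondirs| puts top if nondirs.length == 288 }`.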

You could also build a completely different solution using Find:

require 'find'

roots.each do |root|   # roots: the top-level directories to scan
  dir_count = Hash.new 0

  Find.find root do |file|
    d, f = File.split file
    next if /\A\.{1,2}\z/ =~ f
    dir_count[d] += 1 if File.file? file
  end

  dir_count.each do |dir, cnt|
    puts dir if cnt == 288
  end
end

Kind regards

robert

--
remember.guy do |as, often| as.you_can - without end

 
 
Brad Volz
      12-17-2008

Many thanks for the suggestions.

I wasn't previously aware of Find, but I have been able to get it to
provide all of the information that I need.

Thanks again!

Brad

 