Handling and recursing subdirectories

 
 
Kloudnyne
 
      09-03-2004
I'm a newcomer to Perl and am attempting to teach myself the language
by using it, but I have come across an issue I can't see any way
around.

I am trying to write a Perl script that will go through a series of
directories and their subdirectories, removing javascript, images,
bots, etc from HTML files in order to provide a text-reader friendly
version of each page. The actual conversion of any given file has been
taken care of, thanks to code heavily borrowed from an existing
script, but I can't seem to work out how I can get it to recurse
through the various subdirs.

The snippet of code I've thrown together for it so far is:

***

sub reading {
do {
opendir (CURRENTFOLDER, $htmdir) || die 'Ay Señor! Los bandidos have
raided that directory!'
while defined($filename = readdir(FOLDER)) = True {
$nesting = directorycheck(); #nesting tells me how deep we are into
subdirectories. a zero value is at the root of the process. only
really intended as a flag for testing
#dircheck checks to see if our victim this cycle is a
subdirectory. If it is, (I hope) we'll launch in to a nested subcycle.
}

closedir(CURRENTFOLDER); # with a little luck, re-opening the
previous folder will have the pointer still at the last position
checked, or else we have uber-recursives
chop ($htmdir); #prepping the string to ensure that the trailing char
is NOT a / (not that it should be anyway)
do {

}
until (chop($htmdir) ne '/');
# now that we've gone back to (and removed) the / nearest to the end
of the handle, we've effectively gone back to the parent directory
$nesting -- ;
}
until $htmdir = $htmroot;
}


sub directorycheck {
if (-d $filename) {
dircheck = $nesting + 1 ;
$htmdir = $filename .=$htmdir;
chdir ($htmdir);
} else {
$txtdir = $htmdir; # sets $txtdir to mirror $htmdir, but in the
/txt/ directory, where we want our output to be.
$txtdir =~s/htdocs/txt/; #(hopefully) changes the file path for
output to the /txt/ equivalent of the current /htdocs/ folder
parsetxt(); # only parses if we've hit a file, rather than a subdir.
}
}

***

where "parsetxt" is the subroutine that handles the actual conversion.
However, I can't even get this to compile, let alone run it to see if
it just dies or recurses away to infinity, or whatever.

The script is intended to run on a linux box acting as a webserver,
but for purposes of writing/testing I'm using ActivePerl 5.8 on a
win2k machine.

My question, after all this explanation, is this: Am I barking up the
wrong tree here, or am I just missing one little thing that will make
all this work? If anyone else has a piece of code that will fulfil my
requirements and make my life easier, you will have my undying
gratitude, because at this point I'm seriously starting to reconsider
scripting and just perform the conversions manually.

Thanks for your time.


PS: I apologise for the hideous formatting. It's actually quite
legible on a full-width screen, and I didn't want to disturb the text
for fear of accidentally altering the code.
 
Paul Lalli
 
      09-03-2004
"Kloudnyne" wrote:
> I am trying to write a Perl script that will go through a series of
> directories and their subdirectories, removing javascript, images,
> bots, etc from HTML files in order to provide a text-reader friendly
> version of each page. The actual conversion of any given file has been
> taken care of, thanks to code heavily borrowed from an existing
> script, but I can't seem to work out how I can get it to recurse
> through the various subdirs.
>
> The snippet of code I've thrown together for it so far is:


<snip attempt at manual directory recursion>

> My question, after all this explanation, is this: Am I barking up the
> wrong tree here, or am I just missing one little thing that will make
> all this work? If anyone else has a piece of code that will fulfil my
> requirements and make my life easier, you will have my undying
> gratitude, because at this point I'm seriously starting to reconsider
> scripting and just perform the conversions manually.


The standard (that is, included with your Perl distribution) module
File::Find is what you want to use to recurse through directories. Read
about it by typing the command
perldoc File::Find
at your shell prompt. The CPAN modules File::Finder and
File::Find::Rule also exist if you prefer an alternate syntax.

In the more general case, whenever you find yourself trying to do
something in Perl that has most likely been done before (surely you don't
think you're the only one who's ever needed to recurse through a
directory structure, do you?), you should always check to see if a
module exists which already does it. Modules are stored and shared on
the CPAN, which you can search at http://search.cpan.org

Give File::Find a shot, and if you have problems with it, feel free to
ask for help.
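
For readers following along, a minimal sketch of the File::Find approach described above might look like this. The sub name html_files and the .htm/.html filter are illustrative assumptions, not something from the original posts; only find(), $_ and $File::Find::name come from the module itself.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: return the sorted list of *.htm / *.html files
# found anywhere under $root.
sub html_files {
    my ($root) = @_;
    my @found;
    find(
        sub {
            # By default File::Find chdir()s into each directory, so inside
            # the callback $_ is the bare entry name and $File::Find::name
            # is the path including $root.
            push @found, $File::Find::name if -f $_ && /\.html?\z/i;
        },
        $root
    );
    my @sorted = sort @found;
    return @sorted;
}

# Print every HTML file under the directory given on the command line.
if (@ARGV) {
    print "$_\n" for html_files($ARGV[0]);
}
```

File::Find does the descending into subdirectories itself, so the callback only has to decide what to do with each entry it is handed.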

Paul Lalli


 
Anno Siegel
 
      09-03-2004
Kloudnyne wrote in comp.lang.perl.misc:

[...]

> script, but I can't seem to work out how I can get it to recurse
> through the various subdirs.


You want File::Find (a standard module).

[code snipped]

> PS: I apologise for the hideous formatting. It's actually quite
> legible on a full-width screen, and I didn't want to disturb the text
> for fear of accidentally altering the code.


...so you left the formatting to Usenet, which really messed it up.

Anno
 
 
Joe Smith
 
      09-03-2004
Kloudnyne wrote:

> If anyone else has a piece of code that will fulfil my requirements
> and make my life easier, you will have my undying gratitude...


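use strict;
use warnings;
use File::Find;
# Called once per entry; $_ is the entry name, $File::Find::dir its directory.
sub process { print "Found file $_ in $File::Find::dir\n" if -f $_; }
find(\&process, '/tmp');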
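
Applied to the original question, the same idea could be stretched into a rough sketch that mirrors an htdocs tree into a txt tree. Everything named here is an assumption for illustration: convert_tree is a made-up sub, and this parsetxt is only a crude stand-in for the poster's real conversion routine, which was never shown.

```perl
use strict;
use warnings;
use File::Find;
use File::Path qw(make_path);
use File::Basename qw(dirname);
use File::Spec;

# Hypothetical stand-in for the poster's parsetxt(): crudely drops
# <script> blocks, then strips every remaining tag.
sub parsetxt {
    my ($html) = @_;
    $html =~ s{<script\b.*?</script>}{}gis;
    $html =~ s{<[^>]*>}{}g;
    return $html;
}

# Convert each HTML file under $src_root into a .txt file at the same
# relative path under $dst_root. Returns the list of files written.
sub convert_tree {
    my ($src_root, $dst_root) = @_;
    my @written;
    find(
        {
            no_chdir => 1,    # stay in one cwd so relative roots keep working
            wanted   => sub {
                my $src = $File::Find::name;
                return unless -f $src && $src =~ /\.html?\z/i;

                # Rebuild the path below $dst_root and swap the extension.
                my $rel = File::Spec->abs2rel($src, $src_root);
                (my $dst = File::Spec->catfile($dst_root, $rel)) =~ s/\.html?\z/.txt/i;
                make_path(dirname($dst));

                open my $in, '<', $src or die "Cannot read $src: $!";
                my $html = do { local $/; <$in> };    # slurp the whole file
                close $in;

                open my $out, '>', $dst or die "Cannot write $dst: $!";
                print {$out} parsetxt($html);
                close $out or die "Cannot close $dst: $!";
                push @written, $dst;
            },
        },
        $src_root
    );
    return @written;
}

# Usage: perl convert.pl /path/to/htdocs /path/to/txt
convert_tree(@ARGV) if @ARGV == 2;
```

Computing the output path from the relative path, rather than with s/htdocs/txt/ on the whole string, avoids surprises when "htdocs" appears elsewhere in the path.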

-Joe
 
 
Kloudnyne
 
      09-06-2004
Joe Smith wrote in message news:<P84_c.96904$9d6.59001@attbi_s54>...
<snip>

Thanks for your help. I apologise again for my blatantly obvious noobness.
 
 
 
 