simple web site mapper

 
 
Dan Jacobson
 
      07-24-2003
Say, does this working simple web site mapper (vs. Eric Raymond's
extra large version) have any stuffing hanging out that perl novice me
ought to fix? Say, how does one do '/bin/sh -e' in perl? Would
rewriting it in python be as easy? Does perl have an internal "ls"
that could be called as easy? File::Find couldn't give me the ls -R
order I prefer I suppose. Goal: to use even less lines of code.

use strict;
require HTML::HeadParser;
my $dir=<~/mywebsite>; #where files are on my computer
my $name="Bob Blobkowski";
my $url='http://blobkowski.org/';
chdir $dir||die;
print <<EOF;
<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd"><HTML><TITLE>$name\'s site map</TITLE>
</HEAD><BODY><H1>$name\'s site map</H1>\n<P><A href="$url">$url</A> as of
EOF
system "date"; print '</P><HR><PRE>';
my $p = HTML::HeadParser->new; my $d;
open LS, "ls -R|"||die;
while(<LS>){ #order nicer than find(1)
chomp;
if(s@:$@@){$d=$_;next}
if(/\.(txt|html)$/){s@^@$d/@;s/..//;print "<A href=\"$_\">$_</A>\n";
if(/\.html$/){$p->parse_file($_);print "\t",$p->header('Title'),"\n";}}}
print "</PRE></BODY></HTML>";
 
 
 
 
 
David K. Wall
 
      07-24-2003
Dan Jacobson wrote:

> Say, does this working simple web site mapper (vs. Eric Raymond's
> extra large version) have any stuffing hanging out that perl
> novice me ought to fix?


My comments below are about style, because that's what really got my
attention.

> Say, how does one do '/bin/sh -e' in
> perl? Would rewriting it in python be as easy?


I don't know much about Python, I've only played with it a bit. But
it would *force* you to use whitespace, which might not be a bad
thing. If you're interested, grab a copy and try it:
http://www.python.org

> Does perl have an
> internal "ls" that could be called as easy? File::Find couldn't
> give me the ls -R order I prefer I suppose. Goal: to use even
> less lines of code.


Forget using less lines of code unless it makes the program more
reliable and/or readable. If I had to revise the code below the very
first thing I would do would be to format it in a way that aids
comprehension instead of hindering it.

Ok, so it's not a long, complicated program. But even short programs
can benefit from a readable style. See 'perldoc perlstyle' for
suggestions.

> use strict;
> require HTML::HeadParser;
> my $dir=<~/mywebsite>; #where files are on my computer
> my $name="Bob Blobkowski";
> my $url='http://blobkowski.org/';
> chdir $dir||die;


There's a low-precedence 'or' that you can use, too.

chdir $dir or die "Cannot chdir to $dir: $!";


> print <<EOF;
><!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01//EN"
> "http://www.w3.org/TR/html4/strict.dtd"><HTML><TITLE>$name\'s site

^^
It's not necessary to escape that single-quote after $name.

> map</TITLE>
></HEAD><BODY><H1>$name\'s site map</H1>\n<P><A href="$url">$url</A>
>as of
> EOF
> system "date"; print '</P><HR><PRE>';
> my $p = HTML::HeadParser->new; my $d;


$d? What's $d? I can figure it out from the code, but a more
descriptive name would be useful. If someone else needs to edit the
code they'll appreciate a longer name -- and you might, too, when you
come back to this program after a while and don't remember all the
details.

Newlines and spaces are not a scarce resource.

> open LS, "ls -R|"||die;
> while(<LS>){ #order nicer than find(1)
> chomp;
> if(s@:$@@){$d=$_;next}


Why use a non-standard delimiter for the substitution operator when
you don't need to? s/:$// is pretty clear; s@:$@@ is slightly
obfuscated, IMHO.
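
For what it's worth, here is a rough sketch of how the directory-tracking part might read with a descriptive variable name and the plain s/:$// form. The sub name paths_from_ls_r and the sample listing are made up for illustration, and the "./" stripping uses an anchored pattern rather than the original s/..//, so treat it as one possible layout rather than a drop-in replacement.

```perl
use strict;
use warnings;

# Turn the lines of an `ls -R` listing into relative paths of .txt/.html files.
# (Hypothetical helper; the name and the sample data are illustrative only.)
sub paths_from_ls_r {
    my @lines = @_;
    my $current_dir = '.';
    my @paths;

    for my $line (@lines) {
        chomp $line;
        next if $line eq '';

        # A line ending in ':' names the directory whose entries follow.
        if ($line =~ s/:$//) {
            $current_dir = $line;
            next;
        }

        next unless $line =~ /\.(?:txt|html)$/;

        my $path = "$current_dir/$line";
        # The pattern itself contains a slash, so braces are justified here.
        $path =~ s{^\./}{};
        push @paths, $path;
    }
    return @paths;
}

# Made-up sample of what `ls -R` might print for a tiny site.
my @sample = (
    ".:\n", "index.html\n", "notes\n", "\n",
    "./notes:\n", "todo.txt\n", "pic.png\n",
);
print "$_\n" for paths_from_ls_r(@sample);
```

Keeping the parsing in its own sub also means it can be tried on canned input without touching the real site directory.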

> if(/\.(txt|html)$/){s@^@$d/@;s/..//;print "<A
> href=\"$_\">$_</A>\n";


qq() would make that easier to read; you wouldn't need to escape the
double-quote each time. See 'perldoc perlop' for quote and quote-
like operators.
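
A small side-by-side of the two quoting styles, assuming a made-up file name in $path. Both build the same string; the qq() form simply avoids the backslashes.

```perl
use strict;
use warnings;

my $path = "docs/index.html";   # hypothetical file name, for illustration

# Double quotes: every embedded double quote needs a backslash.
my $escaped = "<A href=\"$path\">$path</A>\n";

# qq() interpolates the same way but uses parentheses as delimiters.
my $with_qq = qq(<A href="$path">$path</A>\n);

print $with_qq;
print "both forms produce the same string\n" if $escaped eq $with_qq;
```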

> if(/\.html$/){$p->parse_file($_);print
> "\t",$p->header('Title'),"\n";}}}
> print "</PRE></BODY></HTML>";
>


Just because you *can* eliminate whitespace doesn't mean you
*should*.

That may be why no-one else has responded. (No-one had when I posted
this, anyway)

As for bugs, I haven't looked -- I was distracted too much by the
style.

--
David Wall
 
 
 
 
 
Steve Grazzini
 
      07-24-2003
David K. Wall wrote:
> Dan Jacobson wrote:
>
> > Does perl have an internal "ls" that could be called as easy?
> > File::Find couldn't give me the ls -R order I prefer I suppose.


Have a look at File::Find's "preprocess" option.

% perldoc File::Find
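
A minimal sketch of what that could look like, assuming a throwaway directory tree with made-up file names. The preprocess callback receives the entries of each directory and returns the list File::Find should visit, so sorting there gives a predictable order within each directory. Whether the overall order matches ls -R closely enough for the site map is something to check against the real site.

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);
use File::Path qw(make_path);

# Build a small temporary tree to walk (file names are hypothetical).
my $root = tempdir(CLEANUP => 1);
make_path("$root/notes");
for my $file ("b.html", "a.txt", "notes/todo.txt", "notes/pic.png") {
    open my $fh, '>', "$root/$file" or die "Cannot create $root/$file: $!";
    close $fh;
}

my @found;
find(
    {
        # Called once per directory with its entries; the returned list
        # is what File::Find goes on to visit.
        preprocess => sub { sort @_ },

        # Called for every entry; keep only .txt and .html files.
        wanted => sub {
            return unless -f $_ && /\.(?:txt|html)$/;
            (my $relative = $File::Find::name) =~ s{^\Q$root\E/}{};
            push @found, $relative;
        },
    },
    $root,
);

print "$_\n" for @found;
```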

> > use strict;
> > require HTML::HeadParser;
> > my $dir=<~/mywebsite>; #where files are on my computer
> > my $name="Bob Blobkowski";
> > my $url='http://blobkowski.org/';
> > chdir $dir||die;

>
> There's a low-precedence 'or' that you can use, too.
>
> chdir $dir or die "Cannot chdir to $dir: $!";


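The 'or' version is the better habit, but strictly speaking chdir gets
away with it: it's a named unary operator, which binds tighter than ||,
so that line already parses as (chdir $dir) || die. It just dies
without saying why.

> > open LS, "ls -R|"||die;


This one really is broken. open is a list operator, so the || is
swallowed into its last argument: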

% perl -MO=Deparse -e 'open LS, "ls -R|" || die'
open LS, 'ls -R|';
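
The Deparse output above can also be checked at run time. The sketch below uses a made-up path that should not exist and contrasts open (a list operator) with chdir (a named unary operator, which per perlop binds tighter than ||). The variable names are only for illustration.

```perl
use strict;
use warnings;

my $missing = "/no/such/path/for-this-demo";   # hypothetical, should not exist

# open is a list operator: this parses as open(my $fh, '<', ($missing || die)).
# The string is true, so die is never reached and the failure goes unnoticed.
my $open_unchecked = eval { open my $fh, '<', $missing || die "never reached\n"; 1 };

# With low-precedence 'or', the return value of open is what gets tested.
my $open_checked = eval { open my $fh, '<', $missing or die "Cannot open $missing: $!\n"; 1 };
my $open_error = $@;

# chdir is a named unary operator: this parses as (chdir $missing) || die.
my $chdir_died = !eval { chdir $missing || die "chdir failed\n"; 1 };

print "open with ||:  ", ($open_unchecked ? "failure went unnoticed" : "died"), "\n";
print "open with or:  ", ($open_checked ? "no error" : "died: $open_error");
print "chdir with ||: ", ($chdir_died ? "died" : "failure went unnoticed"), "\n";
```

Either way, 'or die' with a message including $! covers both cases without having to remember which built-ins are list operators.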

--
Steve
 
 
Jeff 'japhy' Pinyan
 
      07-24-2003
On Thu, 24 Jul 2003, David K. Wall wrote:

>Dan Jacobson wrote:
>
>> print <<EOF;
>><!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01//EN"
>> "http://www.w3.org/TR/html4/strict.dtd"><HTML><TITLE>$name\'s site

> ^^
>It's not necessary to escape that single-quote after $name.


Yes it is. $name's is pre-Perl 5 syntax for $name::s. It's something
that bites many Perl programmers from time to time.
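
Two ways to keep the apostrophe out of the variable name, sketched with the $name from the original script. Both avoid relying on how a given Perl version treats ' as a package separator (recent releases may behave differently there, so check perldelta for your version).

```perl
use strict;
use warnings;

my $name = "Bob Blobkowski";

# Where ' is treated as a package separator, "$name's" would be read as
# the variable $name::s, so the apostrophe must not touch the name.
my $escaped = "$name\'s site map";    # backslash ends the variable name
my $braced  = "${name}'s site map";   # braces delimit the variable name

print "$escaped\n";
print "$braced\n";
```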

--
Jeff Pinyan RPI Acacia Brother #734 2003 Rush Chairman
"And I vos head of Gestapo for ten | Michael Palin (as Heinrich Bimmler)
years. Ah! Five years! Nein! No! | in: The North Minehead Bye-Election
Oh. Was NOT head of Gestapo AT ALL!" | (Monty Python's Flying Circus)

 
 
David K. Wall
 
      07-24-2003
Jeff 'japhy' Pinyan wrote:

> On Thu, 24 Jul 2003, David K. Wall wrote:
>
>>Dan Jacobson wrote:
>>
>>> print <<EOF;
>>><!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01//EN"
>>> "http://www.w3.org/TR/html4/strict.dtd"><HTML><TITLE>$name\'s
>>> site

>> ^^
>>It's not necessary to escape that single-quote after $name.

>
> Yes it is. $name's is pre-Perl 5 syntax for $name::s. It's
> something that bites many Perl programmers from time to time.


Damn. I knew about the pre-Perl 5 syntax, but a quick
copy/paste/edit/run of code from the OP convinced me that it didn't
matter any more. But there are *two* places in the here-doc that use
an escape quote. I edited one and saw the output from the other.

 
 
 
 