Search.cgi followup

 
 
Ken Saunders
 
      11-22-2004
First, thanks to all those folks who gave me pointers on the Perl
script assignment I was doing. I managed to cobble it together. Now
that I've gotten it together I'm facing another challenge. The script
is working but won't return any search results. Here is the code. Can
someone tell me why I get no results page? I'm new at this and
I've already learned that Perl is really fun and amazingly frustrating
at the same time. Thanks

bnbliss


#!/usr/bin/perl
#this perl script performs a keyword search and displays the results
use strict;
use CGI qw(:standard);
use File::Find;

# Root directory of my website
my $filePath = '/home/classes/ksaund01/public_html';

print header;
print start_html(-bgcolor=>'lightblue');

if( param('criteria') ) {
    find(\&search_file, $filePath);
}
else {
    display_menu();
}

print end_html;

# Subroutines

sub search_file {

    my $query = param('query');

    if( $_ !~ /html|txt$/o ) {
        return();
    }

    open(IN, "$_") or warn "Can't open $_: $!\n";

    while ( my $line = <IN> ) {
        chomp($line);

        # Remove HTML from the line
        $line =~ s/\<.*?\>//g;

        # Clean up the filename and turn it into
        # a valid relative URL so that it can be used
        # as a link
        my $uri = $File::Find::name;
        $uri =~ s/^$filePath//;
        $uri = "/$uri";

        if( $line =~ /$query/o ) {
            print "<A HREF=$uri>$_</A><BR>";
        }

    }
    close(IN);

}

sub display_menu {

    print start_form,
        b('Search this site for:'),
        br,
        textfield(-name=>'criteria'),
        br,
        submit(-name=>'Search'),
        end_form;

}
 
 
 
 
 
Tad McClellan
 
      11-22-2004
Ken Saunders wrote:

> the script
> is working



Don't fix it if it isn't broken.


> but won't return any search results.



Errr, I guess you assign a rather strange definition to the
word "working" then...


> Here is the code can
> someone tell me why I get no results page?

^^^^^^^^^^^^^^^^^^^


What, exactly, does that mean?

The browser "hangs" and you never get *anything* back?

You get a web page, but it doesn't report finding any "hits"?

Have you looked for messages in your server log?

(Have you already looked for Perl FAQs that mention the CGI?

perldoc -q CGI
)


> use strict;



Good, but you should also add:

use warnings; # ask for all the help you can get!


> find(\&search_file, $filePath);


> sub search_file {
>
> my $query = param('query');
>
> if( $_ !~ /html|txt$/o ) {
> return();
> }



I'd switch the order of those 2 operations. There is no point in
fetching a param only to return() without using it.

The m//o does not do anything for the pattern you are using, so
it should not be there. Don't throw options on the end willy-nilly.
Either understand what they do for you, or don't use them yet.


Your pattern will match 'foo.html.bar' you know.

I _was_ just going to ask you to search for "precedence" in perlre.pod,
but that doesn't find docs that explain why it will match. The
right place is harder to find than it should be, but you can find
it by searching for "minimize confusion".

Your pattern says:
match "html" anywhere or "txt" at the end of string

as if you had written /html|(txt$)/.

I expect those are meant to be filename extensions, so you should
also require the dot before the extension.
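
To see the difference, both patterns can be run over a few sample names. This is only a sketch: the file names are made up, and the two helper subs are hypothetical names used here for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helpers, named here only for illustration.
# Original pattern: "html" anywhere, OR "txt" at the end of the string.
sub loose_match  { return $_[0] =~ /html|txt$/     ? 1 : 0 }
# Anchored pattern: literal dot, then html or txt, then end of string.
sub strict_match { return $_[0] =~ /\.(html|txt)$/ ? 1 : 0 }

# Made-up file names.
for my $name ('index.html', 'notes.txt', 'foo.html.bar', 'htmlhelp.gif') {
    printf "%-14s original: %d  anchored: %d\n",
        $name, loose_match($name), strict_match($name);
}
```

All four names pass the original pattern, while only index.html and notes.txt pass the anchored one.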


You can say "unless" instead of "if not", which seems preferable
here (to me at least).


Phew! That's a lot of comments for only 4 lines of code.

So, you can replace those four lines with these two:

return unless /\.(html|txt)$/;
my $query = param('query');

If the query might contain regex metacharacters that you want to
match literally, then you'll want this instead:

my $query = quotemeta param('query');
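
A short sketch of what quotemeta changes, assuming a made-up search term that contains a dot (the term and the sample lines are hypothetical):

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $term = 'v1.0';           # hypothetical search term; "." is a metacharacter
my $safe = quotemeta $term;  # every non-word character gets a backslash

print "quoted pattern: $safe\n";

# Made-up lines to search.
for my $line ('release v1.0 notes', 'build v1x0 log') {
    my $raw_hit  = $line =~ /$term/ ? 'yes' : 'no';
    my $safe_hit = $line =~ /$safe/ ? 'yes' : 'no';
    print "$line: raw=$raw_hit quoted=$safe_hit\n";
}
```

The raw term also matches "v1x0" because the dot matches any character; the quoted one does not. Writing /\Q$term\E/ inside the pattern is another way to get the same literal match.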


> open(IN, "$_") or warn "Can't open $_: $!\n";



perldoc -q vars

What's wrong with always quoting "$vars"?

then:

open(IN, $_) or warn "Can't open $_: $!\n";


> while ( my $line = <IN> ) {
> chomp($line);
>
> # Remove HTML from the line
> $line =~ s/\<.*?\>//g;



perldoc -q HTML

How do I remove HTML from a string?

Which gives several examples of HTML that will mess things
up for the pattern you are using.

HTML tags may span across more than one line too.

Since these are your own files, you might be able to guarantee
that none of that "tricky stuff" will be present, but in general
you would need to do a Real Parse of the HTML to do it correctly.

Angle brackets are not meta in regular expressions, so there is
no need to backslash them.
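
The multi-line case is easy to reproduce. The following is a sketch with a made-up HTML fragment and a hypothetical helper, strip_tags; it shows only the line-spanning problem and is not a substitute for a real parser.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: the same non-greedy strip as in the posted script,
# without the unneeded backslashes, with /s so "." also matches a newline.
sub strip_tags {
    my ($text) = @_;
    $text =~ s/<.*?>//gs;
    return $text;
}

# Made-up fragment: the <a> tag is split across two lines.
my @lines = ('<p>See the <a', ' href="perl.html">Perl page</a></p>');

# Line at a time, as the posted loop does it.
print "per line: [", strip_tags($_), "]\n" for @lines;

# Whole fragment at once.
print "whole:    [", strip_tags(join "\n", @lines), "]\n";
```

Processed a line at a time, the pieces of the split tag are left behind, so a search for "href" would find a hit that is not in the visible text. Processed as one string, the fragment reduces to "See the Perl page". Even then, a ">" inside an attribute value or a comment would still confuse the pattern.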


> $uri =~ s/^$filePath//;
> $uri = "/$uri";



You can combine those 2 into a single substitution:

$uri =~ s/^$filePath/\//;

or, since you now have a slash character in your replacement string,
choose to use an alternate delimiter so that you won't need
any backslashing:

$uri =~ s#^$filePath#/#;
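
A sketch of what that substitution produces, with two assumptions: the helper to_uri is a hypothetical name, and its argument stands in for $File::Find::name. The \Q...\E is an addition here so that any metacharacters in $filePath match literally.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $filePath = '/home/classes/ksaund01/public_html';

# Hypothetical helper; $name stands in for $File::Find::name.
sub to_uri {
    my ($name) = @_;
    # \Q...\E makes any metacharacters in $filePath match literally.
    (my $uri = $name) =~ s#^\Q$filePath\E#/#;
    return $uri;
}

print to_uri("$filePath/docs/index.html"), "\n";
```

For a file below the root this prints //docs/index.html, the same result as the original two statements, because the part left after the prefix already begins with a slash. If a single leading slash is wanted, substituting the empty string instead would leave one.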


> if( $line =~ /$query/o ) {



We need 2 pieces of information to analyse why a pattern match
is not working correctly (the pattern and the string it is to
be matched against).

We have zero of those pieces of information, so we cannot help
explain why it is, or is not, matching...
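
One way to collect both pieces is to print them at the point of the match. A minimal sketch, assuming a hypothetical helper named describe_match; in the CGI script its result could be passed to warn so that it lands in the server log.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: report the pattern, the string, and the outcome.
sub describe_match {
    my ($pattern, $string) = @_;
    my $shown = defined $pattern ? "'$pattern'" : 'undef';
    # Treat an undefined or empty pattern as "no match" rather than trying it.
    my $hit = ( defined($pattern) && length($pattern) && $string =~ /$pattern/ )
        ? 'match' : 'no match';
    return "pattern=$shown string='$string' => $hit";
}

# Made-up inputs.
print describe_match('perl', 'Learning perl is fun'), "\n";
print describe_match(undef,  'Learning perl is fun'), "\n";
```

A printout like this would show straight away whether the pattern holds what you expect. For instance, the posted form names its field 'criteria' while search_file reads param('query'), so the value of $query is worth checking first.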


> print "<A HREF=$uri>$_</A><BR>";



You really should put quotes around your attribute values.

Using an alternate form of double quoting helps to avoid
yet more backslashing:

print qq(<A HREF="$uri">$_</A><BR>);


--
Tad McClellan SGML consulting
Perl programming
Fort Worth, Texas
 