Daniel Bergquist wrote:
> Consider the following chunk of code:
> --------------------------------------------------
> open (IN, "<:raw", "test2.txt") or die "Can't open test.txt";
>
> chomp($line = <IN>);
>
> # Capture excerpt
> $line =~ m/>([^<]+)/;
>
> # Copy first line of excerpt
> $pExcerpt = $1;
You should never use the dollar-digit variables unless you have
first ensured that the match *succeeded*. The variables are only
set when a match succeeds, so after a failed match $1 still holds
whatever the last successful match left in it.
$pExcerpt = $1 if $line =~ m/>([^<]+)/;
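To see the difference, here is a minimal sketch. The two sample strings
are made up for illustration; they are not from your file.

```perl
use strict;
use warnings;

# Hypothetical sample data: the second string has no '>' in it,
# so the pattern cannot match it.
my $first  = '<p>first excerpt';
my $second = 'no markup here';

# This match succeeds and sets $1.
$first =~ m/>([^<]+)/;
my $from_first = $1;

# This match fails, so $1 is left untouched.
$second =~ m/>([^<]+)/;
my $stale = $1;            # still the capture from $first

# Guarded form: the assignment only happens if the match succeeds.
my $guarded;
$guarded = $1 if $second =~ m/>([^<]+)/;

print "unchecked: $stale\n";
print "guarded:   ", (defined $guarded ? $guarded : '(undef)'), "\n";
```

The unchecked copy silently picks up text from the earlier line, which
is the kind of bug that is hard to spot later.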
(I hope you are not trying to parse HTML or XML with regular expressions...)
> # Next line
> chomp($line = <IN>);
>
> # Until we have reached the end of the section
> until($line =~ m/<\/p>/i) {
You should use a module that understands HTML for processing HTML data.
> # Capture useful text
> $line =~ m/([^<]+)/;
The above line of code is useless. You don't put the captured text anywhere.
What do you think that pattern match is doing for you?
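If the intent was to collect the text from every line, the capture has
to be stored somewhere each time through the loop. A rough sketch,
assuming that is what you meant; the @rest array is a made-up stand-in
for the remaining lines of your file:

```perl
use strict;
use warnings;

# Hypothetical stand-in for the lines still to be read from the file.
my @rest = ("second line\n", "third line</p>\n");

my $pExcerpt = 'first line';    # what the first match would have captured

for my $line (@rest) {
    chomp $line;

    # Append the captured text only when the match succeeded.
    if ($line =~ m/([^<]+)/) {
        $pExcerpt .= " $1";
    }

    # Stop once the closing tag has been seen.
    last if $line =~ m{</p>}i;
}

print "$pExcerpt\n";
```

Note that [^<]+ will happily capture the inside of a tag (the "br>" in
"<br>", for instance), which is one more reason not to do it this way.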
> chomp($line = <IN>);
> }
>
> # Capture the rest of the useful text
> $line =~ m/([^<]+)/;
>
> $pExcerpt = "$pExcerpt $1";
$pExcerpt .= " $1";
> Why does it not work the first
> way(which is the way I need it)?
It won't matter if you process it properly (with an HTML module rather
than with regexes).
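For instance, here is a minimal sketch using HTML::TokeParser (from the
HTML-Parser distribution on CPAN, so it has to be installed). The sample
markup is invented; I am assuming you want the text of the first <p>
element, which may not be exactly what your data looks like.

```perl
use strict;
use warnings;
use HTML::TokeParser;    # CPAN module, not part of core Perl

# Hypothetical sample input; the real data would come from the file.
my $html = <<'END';
<div><p>First line of excerpt
second line <b>with bold</b>
last line</p></div>
END

my $parser = HTML::TokeParser->new(\$html)
    or die "Can't create parser";

my $pExcerpt;
if ($parser->get_tag('p')) {
    # Text up to the closing </p>, with whitespace collapsed
    # and nested tags such as <b> skipped over.
    $pExcerpt = $parser->get_trimmed_text('/p');
}

print "$pExcerpt\n" if defined $pExcerpt;
```

The line breaks inside the paragraph stop mattering, so there is no
need to stitch the excerpt together line by line.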
--
Tad McClellan SGML consulting
Perl programming
Fort Worth, Texas