Velocity Reviews


Kevin B 10-16-2003 05:14 PM

Reading text file
 
I have the following short script that I'm using to clean up the source of a
web page in order to index and search the page:

#!/usr/bin/perl
#striphtml.pl

undef $/;
open FD, "< testfile1.txt" or die $!;

while (<FD>) {
#s/\r\n//gs;

#s/^\s+$//;
s/<.*?>//gs;
trim();
print "$_";
}

sub trim {

my @out = @_ ? @_ : $_;
$_ = join(' ', split(' ')) for @out;
return wantarray ? @out : "@out";
}


The problem is that it leaves blank lines in the output, and using chomp
does not clean them up. What am I missing to clean up the lines?

Kevin




Roy Johnson 10-16-2003 10:14 PM

Re: Reading text file
 
This newsgroup is defunct. You will reach more people if you post in
comp.lang.perl.misc instead.

"Kevin B" <karigna@verizon.net> wrote in message news:<GlAjb.17261$zw4.10936@nwrdny01.gnilink.net>. ..
> undef $/;


Ok, you're slurping the whole file in at once...

> open FD, "< testfile1.txt" or die $!;
>
> while (<FD>) {


No real point in a while loop if you're getting the whole file in one
read. Just do
$_ = <FD>;
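
A slightly tidier slurp, as a sketch only (assuming a lexical filehandle
is acceptable), localizes $/ inside a block so the rest of the script is
unaffected:

{
    local $/;                                    # slurp mode, only inside this block
    open my $fh, '<', 'testfile1.txt' or die $!;
    $_ = <$fh>;                                  # whole file in one read
    close $fh;
}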

> s/<.*?>//gs;


strip out all the tags...

> print "$_";


No need for the quotes. In this case, no need for an argument at all.
Just
print;

> the problem is that it leaves blank lines in the output and the use of chomp
> does not clean up. What am I missing to clean up the lines?


Maybe something like
tr/\n//s;
or
s/\n\s*\n/\n/g;
?
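
Putting those pieces together, a minimal rewrite of the whole script
(an untested sketch, keeping your filename and tag-stripping regex)
could look like:

#!/usr/bin/perl
# striphtml.pl -- slurp the page, drop tags, squeeze blank lines
use strict;
use warnings;

open my $fh, '<', 'testfile1.txt' or die $!;
my $text = do { local $/; <$fh> };   # slurp the whole file
close $fh;

$text =~ s/<.*?>//gs;                # strip the tags
$text =~ s/[ \t]+/ /g;               # squeeze runs of spaces and tabs
$text =~ s/\n\s*\n/\n/g;             # collapse blank lines
print $text;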

