Reading text file

 
 
Kevin B
      10-16-2003
I have the following short script that I'm using to clean up the source of a
web page in order to index and search the page:

#!/usr/bin/perl
#striphtml.pl

undef $/;
open FD, "< testfile1.txt" or die $!;

while (<FD>) {
#s/\r\n//gs;

#s/^\s+$//;
s/<.*?>//gs;
trim();
print "$_";
}

sub trim {

my @out = @_ ? @_ : $_;
$_ = join(' ', split(' ')) for @out;
return wantarray ? @out : "@out";
}


The problem is that it leaves blank lines in the output, and using chomp
does not clean them up. What am I missing to clean up the lines?

Kevin



 
Roy Johnson
      10-16-2003
This newsgroup is defunct. You will reach more people if you post in
comp.lang.perl.misc instead.

"Kevin B" wrote:
> undef $/;


Ok, you're slurping the whole file in at once...

> open FD, "< testfile1.txt" or die $!;
>
> while (<FD>) {


No real point in a while, if you're getting the whole file in one
read. Just do
$_ = <FD>;
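
For what it's worth, here is a minimal sketch of that single read wrapped in a sub. The name slurp_file, the lexical filehandle, the three-argument open and the local $/ are my own choices for illustration, not something from the original script:

```perl
use strict;
use warnings;

# Hypothetical helper: read a whole file into one string with a single read.
sub slurp_file {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    local $/;                 # undef only inside this sub, restored on return
    my $text = <$fh>;         # one read returns the entire file
    close $fh;
    return defined $text ? $text : '';
}
```

Assuming the same input file as above, you would call it as my $text = slurp_file('testfile1.txt'); Using local keeps the slurp setting from leaking into any later code that reads line by line.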

> s/<.*?>//gs;


strip out all the tags...

> print "$_";


No need for the quotes. In this case, no need for an argument at all.
Just
print;

> the problem is that it leaves blank lines in the output and the use of chomp
> does not clean up. What am I missing to clean up the lines?


Maybe something like
tr/\n//s;
or
s/\n\s*\n/\n/g;
?
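
Two things that probably explain what you're seeing. With $/ undefined, chomp removes nothing, so it can't help here. And trim() as posted works on a copy and is called in void context, so $_ is never changed. The tr version only squeezes runs of consecutive newlines, so a line holding just spaces would survive; the s/// version catches those too. A rough sketch putting it together, on a made-up sample string (the sub name and the CRLF handling are my assumptions, and a regex is not a real HTML parser):

```perl
use strict;
use warnings;

# Hypothetical helper: strip tags the way the thread does, then drop blank lines.
sub strip_and_squeeze {
    my ($text) = @_;
    $text =~ s/<.*?>//gs;       # remove tags, including ones that span lines
    $text =~ s/\r\n/\n/g;       # normalise CRLF line endings (assumption)
    $text =~ s/\n\s*\n/\n/g;    # collapse empty or whitespace-only lines
    $text =~ s/^\s*\n//;        # drop a blank line left at the very start
    return $text;
}

# Made-up sample input, not the poster's actual file.
my $sample = "<html>\n<p>one</p>\n   \n\n<b>two</b>\n</html>\n";
print strip_and_squeeze($sample);   # prints "one" and "two" on consecutive lines
```

This leaves leading spaces inside non-blank lines alone; if you want those gone as well, something like s/^[ \t]+//mg after the other substitutions would do it.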
 