Trim whitespace with cookbook recipe does not result in trimmed array

 
 
io
      02-12-2006
Kind folks of Perlidom-

My intent is to slurp a big text file (say, a chapter from the
English literature). I then want to trim all the white space and
newlines, so I get an array of compact text. I've looked into the
Perl Cookbook, and came up with this.
However, it doesn't work. The array slurped up has as many spaces as
the original. In fact, it looks the same.

####################################
#!/usr/bin/perl
use warnings;

open INPUT, "textfile" or die $1;
my @data;

while(<INPUT>) {
my $fh = <INPUT>;
chomp $fh;
$fh =~ s/^\s+//;
$fh =~ s/\+$//;
push @data, $fh;
}

print @data, "\n";
###################################

Is the issue the print statement? Is it the stream? Is it a scope
issue with push @data?

I don't get it! Please help...

TIA.



 
usenet@DavidFilmer.com
      02-12-2006
io wrote:

> #!/usr/bin/perl
> use warnings;


Enabling warnings is good, but it's only part of what's really important
here - you forgot 'use strict;', which is THE most important
statement in any Perl program.

>
> open INPUT, "textfile" or die $1;


What do you think $1 will contain here? Check out

perldoc perlvar

and look for variables related to "error" (especially $!)
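
A minimal sketch of the difference, assuming a file name that does not exist (no-such-file-for-this-demo.txt is made up for illustration). $1 only ever holds a regex capture, while $! carries the operating system's reason for the failed open:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical file name, chosen so that the open is expected to fail.
my $missing = 'no-such-file-for-this-demo.txt';

my ( $capture_state, $os_error );
if ( !open( my $in, '<', $missing ) ) {
    # $1 is a regex capture variable; no match has run, so it is undefined.
    $capture_state = defined $1 ? 'defined' : 'undefined';

    # $! holds the system's explanation of why the open failed.
    $os_error = "$!";

    print "\$1 is $capture_state\n";
    print "\$! is '$os_error'\n";
}
```

So "die $1" would die with no useful message, while "die $!" reports why the file could not be opened.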

> my @data;
>
> while(<INPUT>) {
> my $fh = <INPUT>;


Make up your mind how you want to read the file. Each time you say
<INPUT> you read from the file. I think you want to leave off the
second statement completely and just do:

while (my $fh = <INPUT>) {

($fh is a terrible name for the variable, BTW, since it usually means
"file handle" which it is not - the filehandle here is called *INPUT).

> chomp $fh;
> $fh =~ s/^\s+//;


OK, fine, strip off trailing linefeed and leading whitespace.

> $fh =~ s/\+$//;


Huh? Did you cut-and-paste that, or are you re-typing code into usenet
(bad idea)? If that is actual cut-and-paste, be advised that you are
removing a single trailing plus sign, not trailing whitespace.

> push @data, $fh;
> }
>
> print @data, "\n";
> ###################################


Your main problem is the two <INPUT> reads. Secondary problem is the
faulty second s/// statement.
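
Putting both fixes together, a minimal sketch might look like the following. The sample text is invented, and it is read from an in-memory string only so that the example runs on its own; a real script would open the file name instead:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Made-up stand-in for the contents of "textfile".
my $sample = "   first line   \n\tsecond line\t\n third+ \n";

# Reading from a string keeps the sketch self-contained; a real script
# would pass the file name here instead of \$sample.
open( my $in, '<', \$sample ) or die "can't open sample: $!\n";

my @data;
while ( my $line = <$in> ) {    # exactly one read per iteration
    $line =~ s/^\s+//;          # strip leading whitespace
    $line =~ s/\s+$//;          # strip trailing whitespace, newline included
    push @data, $line;
}
close $in;

# Brackets make any leftover whitespace visible.
print "[$_]\n" for @data;
```

Note that the plus sign on the third line survives, because \s+ matches whitespace and \+ matches a literal plus.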

--
http://DavidFilmer.com

 
it_says_BALLS_on_your_forehead
      02-12-2006

usenet@DavidFilmer.com wrote:
> > chomp $fh;
> > $fh =~ s/^\s+//;

>
> OK, fine, strip off trailing linefeed and leading whitespace.

actually, this is trimming *leading* whitespace...

 
it_says_BALLS_on_your_forehead
      02-12-2006

it_says_BALLS_on_your_forehead wrote:
> usenet@DavidFilmer.com wrote:
> > > chomp $fh;
> > > $fh =~ s/^\s+//;

> >
> > OK, fine, strip off trailing linefeed and leading whitespace.

> actually, this is trimming *leading* whitespace...

...which is exactly what you said. perhaps i should learn to read.

 
it_says_BALLS_on_your_forehead
      02-12-2006

io wrote:
> Kind folks of Perlidom-
>
> My intent is to slurp a big text file (say, a chapter from the
> English literature). I then want to trim all the white space and
> newlines, so I get an array of compact text. I've looked into the
> Perl Cookbook, and came up with this.
> However, it doesn't work. The array slurped up has as many spaces as
> the original. In fact, it looks the same.
>
> ####################################
> #!/usr/bin/perl
> use warnings;
>
> open INPUT, "textfile" or die $1;


i think you missed the shift key

open INPUT, "textfile" or die $!;

although really this should be:

my $file = 'textfile';
open ( my $fh, '<', $file ) or die "can't open $file: $!\n";
# now s/INPUT/\$fh/g;

> my @data;
>
> while(<INPUT>) {
> my $fh = <INPUT>;
> chomp $fh;
> $fh =~ s/^\s+//;
> $fh =~ s/\+$//;


i think you mean:
$fh =~ s/\s+$//; # i don't think you need the chomp if you're doing this...

> push @data, $fh;
> }
>
> print @data, "\n";
> ###################################
>
> Is the issue the print statement? Is it the stream? Is it a scope
> issue with push @data?
>
> I don't get it! Please help...


technically, the above code does not 'slurp'. you are performing
line-by-line processing. check out Uri Guttman's article on slurping:
http://www.perl.com/pub/a/2003/11/21/slurp.html
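
For comparison, here is a minimal sketch of an actual slurp, assuming the common "local $/" idiom (the article covers other approaches). The sample text is invented and read from an in-memory string so the example is self-contained:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Made-up stand-in for the file contents.
my $sample = "  It was the best of times,  \n  it was the worst of times.  \n";

open( my $in, '<', \$sample ) or die "can't open sample: $!\n";

# With $/ undefined, a single read returns everything up to end of file.
my $text = do { local $/; <$in> };
close $in;

# One possible follow-up: trim spaces and tabs at both ends of every line.
# /m makes ^ and $ match at line boundaries, /g repeats the substitution.
$text =~ s/^[ \t]+|[ \t]+$//mg;

print $text;
```

The whole file ends up in one scalar, so this only makes sense when the file comfortably fits in memory.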

 
io
      02-12-2006
usenet@DavidFilmer.com writes:

> io wrote:
>
> What do you think $1 will contain here? Check out
>
> Your main problem is the two <INPUT> reads. Secondary problem is the
> faulty second s/// statement.
>


Hi --

I followed your recommendations. I still get an unmodified array when I print it.
I'm really confused as to why no destructive modification was made to
the array. I still get an output with spaces.
Maybe I formulated the problem in a bad way...I'd like to have *no*
spaces between characters, as well as no carriage return (Unix,
Windows, etc). Could that be the problem?
I'll 'fess up that I don't grok regexes yet.


#!/usr/bin/perl
use warnings;
use strict;

open INPUT, "textfile" or die $!;
my @data;

my @element;
while(my $file = <INPUT>) { # filehandle in *INPUT
# chomp $file; # don't need the chomp because of the second regex
$file =~ s/^\s+//;
push @data, $file;
$file =~ s/\s+$//;
push @data, $file;


my $element;
foreach $element (@data) {
print $element;
}
}


Any ideas anyone?

TIA.
 
Paul Lalli
      02-12-2006
io wrote:
> I followed your recommendations. I still get an unmodified array when I print it.


You have not made any attempt at modifying an array. What part of your
code did you think was doing this?

> I'm really confused as to why no destructive modification was made to
> the array. I still get an output with spaces.
> Maybe I formulated the problem in a bad way...I'd like to have *no*
> spaces between characters, as well as no carriage return (Unix,
> Windows, etc). Could that be the problem?


What problem?

> I'll 'fess up that I don't grok regexes yet.


Your problem is that you are copy and pasting code that you've seen
elsewhere without knowing what it does.

> #!/usr/bin/perl
> use warnings;
> use strict;
>
> open INPUT, "textfile" or die $!;
> my @data;
>
> my @element;
> while(my $file = <INPUT>) { # filehandle in *INPUT
> # chomp $file; # don't need the chomp because of the second regex
> $file =~ s/^\s+//;


This removes all whitespace from the BEGINNING OF THE LINE

> push @data, $file;


This adds the current line to @data.

> $file =~ s/\s+$//;


This removes all whitespace from the END OF THE LINE

> push @data, $file;


This adds the SAME line, now with ending spaces removed, to @data.
>
> my $element;
> foreach $element (@data) {


This loops through each line that you put into @data (which you did
twice for each line)

> print $element;


This prints each element, one per loop

> }


Your for loop is inside your while loop. You are printing the entire
contents of @data once for every line of the file, meaning you are
getting output similar to:
line 1
line 1
line 2
line 1
line 2
line 3
line 1
line 2
line 3
line 4
<etc>
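
The effect of the nesting can be sketched without a file at all. This is only an illustration with invented lines, pushing each line once, and it collects the would-be output in arrays so the two variants can be compared:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Made-up stand-in for the lines read from the file.
my @lines = ( "line 1\n", "line 2\n", "line 3\n" );

# Nested: everything collected so far is emitted again on every pass.
my ( @data, @nested_output );
for my $line (@lines) {
    push @data, $line;
    push @nested_output, @data;    # what the inner foreach would print
}

# Flat: collect first, emit once after the loop has finished.
my @flat_output = @data;

print "nested:\n", @nested_output;
print "flat:\n",   @flat_output;
```

Moving the foreach below the closing brace of the while loop is what turns the nested behaviour into the flat one.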

> }
>
>
> Any ideas anyone?


Yes. Learn. Do not simply copy and paste. Make an effort to
understand what your program is doing.

Go read:
perldoc perlretut
perldoc perlre
perldoc perlreref

To remove *all* whitespace from a scalar value, regardless of where it occurs:

$file =~ s/\s+//g;


Please go read the Posting Guidelines for this group, and follow their
advice. Specifically, show your sample input, your desired output, and
your actual output. Post a *self-contained* program, using the
__DATA__ marker and <DATA> pseudo-filehandle, rather than just asking
us to accept the fact that you have an input file that you've opened
for reading.

Paul Lalli

 
Ch Lamprecht
      02-12-2006
io wrote:
> usenet@DavidFilmer.com writes:
>
>
>>io wrote:
>>
>>What do you think $1 will contain here? Check out
>>
>>Your main problem is the two <INPUT> reads. Secondary problem is the
>>faulty second s/// statement.
>>

>
>
> Hi --
>
> I followed your recommendations. I still get an unmodified array when I print it.
> I'm really confused as to why no destructive modification was made to
> the array. I still get an output with spaces.
> Maybe I formulated the problem in a bad way...I'd like to have *no*
> spaces between characters, as well as no carriage return (Unix,
> Windows, etc).


Hi,
I can hardly believe that this really is what you want:
No spaces, no newlines...

use warnings;
use strict;

my $text;

while(my $file = <DATA>) {
$file =~ s/\s+//g;
$text.=$file;
}
print $text;

__DATA__
Hi --

I undertook your recommendations. I still get an unmodified array when
I print it.
I'm really confused as to why no detructive modification was made to
the array. I stil get an output with spaces.
Maybe I formulated the problem in a bad way...I'd like to have *no*
spaces between characters, as well as no carriage return (Unix,
Windows, etc). Could taht be the problem?
I'll 'fess up that I don't grok regexes yet.

--

perl -e "print scalar reverse q/(E-Mail Removed)/"
 
it_says_BALLS_on_your_forehead
      02-12-2006

Ch Lamprecht wrote:
> io wrote:
> > usenet@DavidFilmer.com writes:
> >
> >
> >>io wrote:
> >>
> >>What do you think $1 will contain here? Check out
> >>
> >>Your main problem is the two <INPUT> reads. Secondary problem is the
> >>faulty second s/// statement.
> >>

> >
> >
> > Hi --
> >
> > I followed your recommendations. I still get an unmodified array when I print it.
> > I'm really confused as to why no destructive modification was made to
> > the array. I still get an output with spaces.
> > Maybe I formulated the problem in a bad way...I'd like to have *no*
> > spaces between characters, as well as no carriage return (Unix,
> > Windows, etc).

>
> Hi,
> I can hardly believe that this really is what you want:
> No spaces, no newlines...
>
> use warnings;
> use strict;
>
> my $text;
>
> while(my $file = <DATA>) {
> $file =~ s/\s+//g;
> $text.=$file;
> }
> print $text;
>


i agree. i thought about suggesting s/\s+//g, but that wouldn't really
be *trimming* whitespace, that would be removing it altogether.

what we need is a more exact definition of your problem/goal.
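
To make the distinction concrete, here is a small sketch of three candidate goals applied to the same invented string. Which one matches the original intent is an open question; the variable names are only for illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Made-up sample with leading, inner and trailing whitespace.
my $original = "  the quick  brown fox \n";

# Trim: whitespace removed at both ends only.
( my $trimmed = $original ) =~ s/^\s+|\s+$//g;

# Squeeze: additionally collapse inner runs of whitespace to one space.
( my $squeezed = $trimmed ) =~ s/\s+/ /g;

# Strip: every whitespace character removed, wherever it occurs.
( my $stripped = $original ) =~ s/\s+//g;

print "[$trimmed]\n[$squeezed]\n[$stripped]\n";
```

The parenthesised assignment copies the string first, so $original is left untouched by each substitution.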

 
SomeDude
      02-13-2006
On Sun, 12 Feb 2006 18:50:16 -0200, io wrote:

> Hi --
>
> I followed your recommendations. I still get an unmodified array when I print it.
> I'm really confused as to why no destructive modification was made to



Thanks for your answers guys.
Just a note: don't go assuming a newbie hasn't read.
The semantics of the while loop and filehandles are not trivial (e.g., when
implicit assignment to the global $_ happens) and I lost a good chunk of my
afternoon in the Camel book trying to understand and test some things.

Please look up DATA in the Camel book and check what page
it is on. You can't expect a newbie to know that. If you expect
messages to be posted with data "in place" then don't complain to newbies,
modify the Posting Guidelines, where no recommendation of <DATA> and
__DATA__ can be found. Those are in Chapter 10 of a very thick Perl book
(Ed Peschko's).

http://groups.google.de/group/comp.l...5d6f2ea37a3190

Yes, the regex code was pasted from the Cookbook, that's what it's for,
but my English comprehension and my limited regex knowledge got me stuck.
 