Trim whitespace with cookbook recipe does not result in trimmed array

 
 
Tad McClellan
 
      02-13-2006
io <(E-Mail Removed)> wrote:


> My intent is to slurp a big text file (say, a chapter from the
> English literature). I then want to trim all the white space and
> newlines,



Here you say _all_ whitespace, but your code appears to be trying
to delete only leading and trailing whitespace.

Which is it?

The value of the implementation is directly proportional to
the value of the specification, you know.
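
The two readings of the spec give different results. A minimal sketch, assuming a hypothetical sample line in $line (not the poster's data):

```perl
use strict;
use warnings;

# Hypothetical sample input, chosen only for illustration.
my $line = "  the quick   brown fox \n";

# Reading 1: trim only leading and trailing whitespace.
( my $trimmed = $line ) =~ s/^\s+|\s+$//g;

# Reading 2: delete every whitespace character.
( my $squeezed = $line ) =~ s/\s+//g;

print "[$trimmed]\n";    # [the quick   brown fox]
print "[$squeezed]\n";   # [thequickbrownfox]
```

Interior runs of spaces survive the first form and vanish in the second, which is why the spec has to say which one is wanted.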


> so I get an array of compact text.



my @data = map { s/^\s+//; s/\s+$// } <INPUT>; # untested


> The array slurped up has as many spaces as
> the original. In fact, it looks the same.

^^^^^^^^^^^^^^^^^
^^^^^^^^^^^^^^^^^

I am afraid that I do not believe you...


> ####################################
> #!/usr/bin/perl
> use warnings;
>
> open INPUT, "textfile" or die $1;

^
^
You missed the SHIFT key there.

It should be $! not $1.
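
The difference matters because $1 holds the first regex capture group, while $! holds the operating system's error for the last failed call. A sketch, assuming a hypothetical path that does not exist; the three-argument open with a lexical filehandle is a more modern idiom than the one in the quoted post:

```perl
use strict;
use warnings;

# Hypothetical path, assumed not to exist, so that open fails.
my $path = 'no/such/dir/textfile';

my $error = '';
if ( open my $in, '<', $path ) {
    close $in;
}
else {
    # $! was set by the failed open. $1 would be undefined here,
    # because no regex match has captured anything.
    $error = "$!";
}

print "open failed: $error\n" if $error;
```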


> while(<INPUT>) {



Here you read a line into the $_ variable, but you never output
it, so you should be missing every other line in your output,
i.e. it won't "look the same" as the input file.


> my $fh = <INPUT>;



Here you read a 2nd line (you read 2 lines for each loop iteration).

I'd say that $fh is a really poor choice of name for something
that is not a filehandle.
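
Each <INPUT> in scalar context consumes one line, so the loop test and the assignment in the body together use up two lines per pass. A sketch of the effect, and of a loop that reads once per pass, using an in-memory filehandle over hypothetical data so that it is self-contained:

```perl
use strict;
use warnings;

my $text = "one\ntwo\nthree\nfour\n";    # hypothetical input

# Reading twice per iteration: the line read by the while test is discarded.
open my $in, '<', \$text or die $!;
my @skipping;
while (<$in>) {
    my $line = <$in>;                    # second read in the same iteration
    last unless defined $line;
    chomp $line;
    push @skipping, $line;
}
close $in;

# Reading once per iteration: every line is kept.
open $in, '<', \$text or die $!;
my @all;
while ( my $line = <$in> ) {
    chomp $line;
    push @all, $line;
}
close $in;

print "@skipping\n";    # two four
print "@all\n";         # one two three four
```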


> $fh =~ s/\+$//;

^^
^^

Here you miss an "s".

Please take more care in composing your posts, it is wasteful
to submit the wrong stuff to hundreds of people.


> Is the issue the print statement?



No.


> Is it the stream?



No.


> Is it a scope
> issue with push @data?



No.


> I don't get it!



Perl is doing exactly what you told it to do.


> Please help...



Tell Perl to do something else instead.


--
Tad McClellan SGML consulting
(E-Mail Removed) Perl programming
Fort Worth, Texas
 
 
 
 
 
Uri Guttman
 
      02-13-2006
>>>>> "TM" == Tad McClellan <(E-Mail Removed)> writes:


TM> my @data = map { s/^\s+//; s/\s+$// } <INPUT>; # untested

tad, i am ashamed of you for posting that line. you know the map block
returns its last value and not the original in $_? and even though that
is not in void context, i eschew any side effects in map/grep as they
are meant to be functional in style. so i would slurp first and trim
later:

use File::Slurp ;

my @data = read_file( 'whatever_file' ) ;
s/^\s+//, s/\s+$// for @data ;
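
To see the difference concretely: s/// returns the number of substitutions made (or an empty string if none), and that is what the map block hands back. A sketch with hypothetical input lines, using only core Perl rather than File::Slurp:

```perl
use strict;
use warnings;

my @lines = ( "  alpha  \n", "beta\n", "\tgamma" );    # hypothetical input

# The map block's value is the result of its last statement, s/\s+$//,
# i.e. 1 on success or an empty string on failure, not the trimmed text.
# A copy is used because the substitutions also modify the list in place.
my @copy  = @lines;
my @wrong = map { s/^\s+//; s/\s+$// } @copy;

# Trimming in place with a statement modifier instead.
my @data = @lines;
s/^\s+//, s/\s+$// for @data;

print join( '|', @wrong ), "\n";    # 1|1|
print join( '|', @data ),  "\n";    # alpha|beta|gamma
```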

>> The array slurped up has as many spaces as
>> the original. In fact, it looks the same.

TM> ^^^^^^^^^^^^^^^^^
TM> ^^^^^^^^^^^^^^^^^

TM> I am afraid that I do not believe you...

i don't either. since he is not being clear about his goal of removing
whitespace how could we trust his opinion of bad output? we don't even
have a proper spec to test against.

>> Please help...


TM> Tell Perl to do something else instead.

and be accurate in telling us what you actually want done (best with
input and expected output examples) and why you think it is not
working. otherwise we are doing brain surgery on you while wearing
boxing gloves. do you want that?

uri

--
Uri Guttman ------ (E-Mail Removed) -------- http://www.stemsystems.com
--Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
Search or Offer Perl Jobs ---------------------------- http://jobs.perl.org
 
 
 
 
 
A. Sinan Unur
 
      02-13-2006
SomeDude <(E-Mail Removed)> wrote in
news:(E-Mail Removed):

> On Sun, 12 Feb 2006 18:50:16 -0200, io wrote:
>
>> Hi --
>>
>> I undertook your recommendations. I still get an unmodified array
>> when I print it. I'm really confused as to why no destructive
>> modification was made to


....

> If you expect that
> messages to be posted with data "in place" then don't complain to
> newbies, modify the Posting Guidelines, where no recommendation to
> <DATA> and __DATA__ can be found.


That is a blatant lie. See the subsection with the title "Provide enough
information".

*PLONK*

Sinan

--
A. Sinan Unur <(E-Mail Removed)>
(reverse each component and remove .invalid for email address)

comp.lang.perl.misc guidelines on the WWW:
http://mail.augustmail.com/~tadmc/cl...uidelines.html

 
 
Paul Lalli
 
      02-13-2006
SomeDude wrote:
>
> Thanks for your answers guys.
> Just a note: don't go assuming a newbie hasn't read.


That is the only possible assumption when "a newbie" gives no
indication otherwise.

> The semantics of the while loop and filehandles is not trivial (e.g., when
> implicit assignment happens to global $_) and I lost a good chunk of my
> afternoon in the Camel book trying to understand and test some things.
>
> Please look up DATA in the Camel book and check what page
> it is on.


Okay. Let's see, the index of my Camel book has an entry "DATA
filehandle" which points me to the page on special variables, which
says:

DATA

[PKG] This special filehandle refers to anything following either
the __END__ token or the __DATA__ token in the current file. The
__END__ token always opens the main::DATA filehandle, and so is used in
the main program. The __DATA__ token opens the DATA handle in whichever
package is in effect at the time, so different modules can each have
their own DATA filehandle, since they (presumably) have different
package names.
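
As a sketch of what that looks like in a posted script (the sample lines after the token are hypothetical): everything following __DATA__ can be read through the DATA filehandle just like a file opened with open.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Read the lines that follow the __DATA__ token below.
while ( my $line = <DATA> ) {
    chomp $line;
    print "got: [$line]\n";
}

__DATA__
first sample line
  second sample line, with leading spaces
```

Anyone can paste such a script into a file and run it without needing a separate input file, which is the point of the guideline.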

> You can't expect a newbie to know that.


I beg to differ.

> If you expect that
> messages to be posted with data "in place" then don't complain to newbies,
> modify the Posting Guidelines, where no recommendation to <DATA> and
> __DATA__ can be found.


Ahem. You are either mistaken or outright lying. Go read the Posting
Guidelines again.

Tad McClellan wrote (hundreds of times) :
> Describe *precisely* the input to your program. Also provide example
> input data for your program. If you need to show file input, use the
> __DATA__ token (perldata.pod) to provide the file contents inside of
> your Perl program.


It tells you what to do, and gives you the pointer to precisely where
__DATA__ is described.

> Those are on Chapter 10 of a very thick Perl book
> (Ed Peschko's).


Never heard of him. Perhaps you need a better book. And why are you
talking about this book when you just asked us to go look it up in the
Camel?

> http://groups.google.de/group/comp.l...5d6f2ea37a3190
>
> Yes, the regex code was pasted from the Cookbook, that's what it's for,


No, it's really not. It's for helping you understand how to make your
own Perl programs. It is not for blind copy and pastes.

This was an excellent way to get yourself plonked by many of the most
knowledgeable and helpful people in this newsgroup, by the way. Your
response was a very unfortunate choice to have made. The correct
response was "Oh, I'm sorry, I'll go re-read the Posting Guidelines and
fix my posts in the future."

Fare thee well,
Paul Lalli

 
 
Tad McClellan
 
      02-13-2006
io <(E-Mail Removed)> wrote:

> I'd like to have *no*
> spaces between characters,


> I'll 'fess up that I don't grok regexes yet.



That's OK, because you do not need regexes to accomplish that task:

$str =~ tr/ \n\r\f\t//d; # delete _all_ whitespace characters


There is no regular expression there.
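
A sketch of how this compares with the regex form, on a hypothetical string. tr///d deletes exactly the characters listed and returns how many it removed. \s in a regex can match a somewhat broader set (depending on the Perl version and on Unicode rules), so the two agree for ordinary ASCII text like this but are not strictly equivalent.

```perl
use strict;
use warnings;

my $str = "Then we\tturned,\r\n and took \f our journey\n";    # hypothetical

# Transliteration: delete the five listed characters, get back the count.
my $via_tr = $str;
my $count  = ( $via_tr =~ tr/ \n\r\f\t//d );

# Regex substitution, for comparison.
( my $via_regex = $str ) =~ s/\s+//g;

print "$via_tr\n";                          # Thenweturned,andtookourjourney
print "removed $count characters\n";
print "same result\n" if $via_tr eq $via_regex;
```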


--
Tad McClellan SGML consulting
(E-Mail Removed) Perl programming
Fort Worth, Texas
 
 
SomeDude
 
      02-14-2006

>
> *PLONK*
>
> Sinan


Yeah, OK, my bad here it is:

Describe *precisely* the input to your program. Also provide example
input data for your program. If you need to show file input, use the
__DATA__ token (perldata.pod) to provide the file contents inside of
your Perl program.

It was like a foreign language to me. I don't think that was well written.
A little example might've gone a long way.





 
 
SomeDude
 
      02-14-2006
On Mon, 13 Feb 2006 04:59:48 -0800, Paul Lalli wrote:

> SomeDude wrote:


>
>> The semantics of the while loop and filehandles is not trivial (e.g., when
>> implicit assignment happens to global $_) and I lost a good chunk of my
>> afternoon in the Camel book trying to understand and test some things.
>>
>> Please look up DATA in the Camel book and check what page
>> it is on.

>
> Okay. Let's see, the index of my Camel book has an entry "DATA
> filehandle" which points me to the page on special variables, which
> says:
>

Which is page? Get serious. Or modify the Posting Guidelines. The way
it's written would get one to flunk essay writings in college. IMHO.
IT's called communicating clearly, and it goes a long, long way in
corporations and other well-paying jobs.

Anyways, this is what I wanted to do. I decided to use a suggestion that
used a string, it was faster than any modification I would've done, plus
I can use unpack easily.

Cheers and thanks very much and sorry for any misunderstandings.


#!/usr/bin/perl
use warnings;
use strict;

# When not using __DATA__
#open INPUT, "textfile" or die $!;

my $text;
my @chars;
my @array;

while(my $file = <DATA>) {
$file =~ s/\s+//g; # All whitespaces and newlines removed globally
$text.=$file;
}
print $text;
print "\n---------------------------------------\n";

# Take $text and unpack it into an array of ASCII values
@array=unpack("C*", $text);
print "@array\n";


# turn string into an array;
# all characters are separated by a space
@chars = split //, $text;
# just print it;
print "\n--------------------------------------\n";
print "@chars\n";



__DATA__
Deuteronomy, chapter 2


Compare with Revised Standard Version: Deut.02



1: Then we turned, and took our journey into the wilderness by the way of the Red sea, as the LORD spake unto me: and we compassed mount Seir many days.
2: And the LORD spake unto me, saying,
3: Ye have compassed this mountain long enough: turn you northward.
4: And command thou the people, saying, Ye are to pass through the coast of your brethren the children of Esau, which dwell in Seir; and they shall be afraid of you: take ye good heed unto yourselves therefore:



 
 
Tad McClellan
 
      02-14-2006
SomeDude <(E-Mail Removed)> wrote:

> Just a note: don't go assuming a newbie hasn't read.



So you have so much experience with newbie postings that
you can conclude that with confidence? (rhetorical question)

People are likely to assume the most common case, whether you ask
them to assume the rare case or not.

It isn't your fault that the overwhelming majority of newbies
post before attempting any reading, but you still get to take
the heat.

It is unfortunate, but it is also the reality.


> Please look up DATA in the Camel book



I don't care what the Camel book says, it is only backup, it is
not the authority. I care what the real authority says, namely
the standard docs that ship with perl.

If you are programming in Perl, then you have surely read about
the data types available in the language.

The DATA token is described about halfway through perldata.pod.


> You can't expect a newbie to know that.



I can expect the newbie to go away, read it, and then come back though.


> If you expect that
> messages to be posted with data "in place" then don't complain to newbies,
> modify the Posting Guidelines, where no recommendation to <DATA> and
> __DATA__ can be found.



... If you need to show file input, use the __DATA__
token (perldata.pod) ...

That looks like both a recommendation and a reference to me.


--
Tad McClellan SGML consulting
(E-Mail Removed) Perl programming
Fort Worth, Texas
 
 
Matt Garrish
 
      02-14-2006

"SomeDude" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed)...
> On Mon, 13 Feb 2006 04:59:48 -0800, Paul Lalli wrote:
>
>>
>> Okay. Let's see, the index of my Camel book has an entry "DATA
>> filehandle" which points me to the page on special variables, which
>> says:
>>

> Which is page? Get serious. Or modify the Posting Guidelines. The way
> it's written would get one to flunk essay writings in college. IMHO.
> IT's called communicating clearly, and it goes a long, long way in
> corporations and other well-paying jobs.
>


Hmm, pot calling the kettle black...

Matt


 
 
Paul Lalli
 
      02-14-2006
SomeDude wrote:
> On Mon, 13 Feb 2006 04:59:48 -0800, Paul Lalli wrote:
>
> > SomeDude wrote:


> >> Please look up DATA in the Camel book and check what page
> >> it is on.

> >
> > Okay. Let's see, the index of my Camel book has an entry "DATA
> > filehandle" which points me to the page on special variables, which
> > says:
> >

> Which is page?


Why would anyone care what page something in the Camel is on? The
Camel is a reference. Do you care about on what page your entry is
found when looking in an encyclopedia or a dictionary?

> Get serious. Or modify the Posting Guidelines.


The Posting Guidelines - at the very least this section of them - are
perfectly fine. They tell you what to do, and if you don't understand
the instruction, they tell you precisely where to get more information
about the instruction. The fact that you didn't bother reading or
reading carefully enough is your fault, not the Guidelines'.

> The way
> it's written would get one to flunk essay writings in college. IMHO.


Fortunately, the guidelines are not an essay.

> IT's called communicating clearly, and it goes a long, long way in
> corporations and other well-paying jobs.


Gee. I guess I imagined going to my job today, working at a national
banking corporation.

> Anyways, this is what I wanted to do. I decided to use a suggestion that
> used a string, it was faster than any modification I would've done, plus
> I can use unpack easily.
>
> Cheers and thanks very much and sorry for any misunderstandings.
>
>
> #!/usr/bin/perl
> use warnings;
> use strict;
>
> # When not using __DATA__
> #open INPUT, "textfile" or die $!;
>
> my $text;
> my @chars;
> my @array;
>
> while(my $file = <DATA>) {
> $file =~ s/\s+//g; # All whitespaces and newlines removed globally
> $text.=$file;
> }


Why are you using four lines when two will do? More explicitly, why
are you forcing Perl to do all these consecutive reads and
substitutions? Why not just one?

my $file = do { local $/; <DATA>};
$file =~ s/\s+//g;
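
For readers unfamiliar with the idiom: $/ is the input record separator, and localizing it inside the do block leaves it undefined there, so a single read returns the whole file. A self-contained sketch, with an in-memory filehandle over hypothetical text standing in for DATA:

```perl
use strict;
use warnings;

my $source = "1: Then we turned,\n2: And the LORD spake\n";    # hypothetical stand-in
open my $fh, '<', \$source or die $!;

# With $/ undefined, <$fh> returns everything in one string.
my $file = do { local $/; <$fh> };
close $fh;

$file =~ s/\s+//g;    # one substitution over the whole text
print "$file\n";      # 1:Thenweturned,2:AndtheLORDspake
```

Because local restores $/ when the block exits, later line-by-line reads elsewhere in the program are unaffected.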

> print $text;
> print "\n---------------------------------------\n";
>
> # Take $text and unpack it into an array of ASCII values
> @array=unpack("C*", $text);
> print "@array\n";
>
>
> # turn string into an array;
> # all characters are separated by a space
> @chars = split //, $text;
> # just print it;
> print "\n--------------------------------------\n";
> print "@chars\n";


If you had ever once *said* that was your goal, it's quite likely
someone would have helped you achieve that result about 5 posts ago.

Go read the Posting Guidelines. Again.

Paul Lalli

 
 
 
 