split inconsistency- why?

 
 
Sara
 
      08-09-2004
split /,/,'cat,mouse,,eel,fish';

yields
0 'cat'
1 'mouse'
2 ''
3 'eel'
4 'fish'

Fine. Likewise:

split /,/,',,,cat,,mouse,eel,fish';

yields
0 ''
1 ''
2 ''
3 'cat'
4 ''
5 'mouse'
6 'eel'
7 'fish'

Everybody is happy. But

split /,/,'cat,mouse,eel,fish,,,';

yields
0 'cat'
1 'mouse'
2 'eel'
3 'fish'

Huh? Where did the trailing items go?

I work around this inconsistency by adding in "placeholders" like:

s/,,/,#,/g; s/,,/,#,/g;

then do the split, then remove the #'s. What a treat. Thanks Larry!

But I can't help but wonder- what programming advantage does this
offer and why was split designed to ignore some split candidates such
as these trailing items? And why only omit trailing items, and not
leading? Was there some presumption made about leading ones being more
meaningful than trailing? Very odd presumption if so!

To the programmer, it would be easier to make split consistent, and in
those cases when the programmer doesn't want empty trailing items he
can easily prepare the scalar to get rid of them:

s/,+$//;

which is a lot easier than identifying a unique placeholder,
inserting it, splitting, then removing it.

Perl is pretty much self-consistent; in fact this is one of very few
cases I've encountered which lacks consistency. I'd be interested,
though, in hearing the arguments for why this was a beneficial language
design choice.
 
 
Thomas Kratz
 
      08-09-2004
Sara wrote:
> split /,/,'cat,mouse,,eel,fish';
>
> yields
> 0 'cat'
> 1 'mouse'
> 2 ''
> 3 'eel'
> 4 'fish'
>
> Fine. Likewise:
>
> split /,/,',,,cat,,mouse,eel,fish';
>
> yields
> 0 ''
> 1 ''
> 2 ''


one '' too many

> 3 'cat'
> 4 ''
> 5 'mouse'
> 6 'eel'
> 7 'fish'
>
> Everybody is happy. But
>
> split /,/,'cat,mouse,eel,fish,,,';
>
> yields
> 0 'cat'
> 1 'mouse'
> 2 'eel'
> 3 'fish'
>
> Huh? Where did the trailing items go?


perldoc -f split

Look for the description of LIMIT

>
> I work around this inconsistency by adding in "placeholders" like:
>
> s/,,/,#,/g; s/,,/,#,/g;
>
> then do the split,then remove the #'s. What a treat. Thanks Larry!


That is uncalled for. You just have to read the docs.

[rest snipped]

Thomas

--
open STDIN,"<&DATA";$=+=14;$%=50;while($_=(seek( #J~.> a>n~>>e~.......>r.
STDIN,$:*$=+$,+$%,0),getc)){/\./&&last;/\w| /&&( #.u.t.^..oP..r.>h>a~.e..
print,$_=$~);/~/&&++$:;/\^/&&--$:;/>/&&++$,;/</ #.>s^~h<t< ..~. ...c.^..
&&--$,;$:%=4;$,%=23;$~=$_;++$i==1?++$,:_;}__END__#.... >>e>r^..>l^...>k^..
 
 
John W. Krahn
 
      08-09-2004
Sara wrote:
> split /,/,'cat,mouse,,eel,fish';
>
> yields
> 0 'cat'
> 1 'mouse'
> 2 ''
> 3 'eel'
> 4 'fish'
>
> Fine. Likewise:
>
> split /,/,',,,cat,,mouse,eel,fish';
>
> yields
> 0 ''
> 1 ''
> 2 ''
> 3 'cat'
> 4 ''
> 5 'mouse'
> 6 'eel'
> 7 'fish'
>
> Everybody is happy. But
>
> split /,/,'cat,mouse,eel,fish,,,';
>
> yields
> 0 'cat'
> 1 'mouse'
> 2 'eel'
> 3 'fish'
>
> Huh? Where did the trailing items go?


If you had read the documentation for split you would have seen that this
is the defined behavior for the zero-, one- or two-argument versions of split.

perldoc -f split
    split /PATTERN/,EXPR,LIMIT
    split /PATTERN/,EXPR
    split /PATTERN/
    split   Splits a string into a list of strings and returns that
            list. By default, empty leading fields are preserved,
            and empty trailing ones are deleted.


You will also see that there is a third argument "LIMIT" which will solve
your problem.
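
For illustration, here is a minimal sketch of the difference the third argument makes on the string from the original post. The variable names are made up for this example; the only thing it relies on is the documented rule that a negative LIMIT keeps empty trailing fields.

```perl
use strict;
use warnings;

my $line = 'cat,mouse,eel,fish,,,';

# LIMIT omitted: empty trailing fields are deleted.
my @default = split /,/, $line;

# Negative LIMIT: every field is returned, including empty trailing ones.
my @all = split /,/, $line, -1;

printf "default:  %d fields\n", scalar @default;   # default:  4 fields
printf "LIMIT -1: %d fields\n", scalar @all;       # LIMIT -1: 7 fields
```

The three commas after 'fish' delimit three empty fields, which is why the second call returns seven elements.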



John
--
use Perl;
program
fulfillment
 
 
Gunnar Hjalmarsson
 
      08-09-2004
Sara wrote:
>
> split /,/,'cat,mouse,eel,fish,,,';
>
> yields
> 0 'cat'
> 1 'mouse'
> 2 'eel'
> 3 'fish'
>
> Huh? Where did the trailing items go?


Inconsistency or not, it's documented in the first para in

perldoc -f split

> I work around this inconsistency by adding in "placeholders" like:
>
> s/,,/,#,/g; s/,,/,#,/g;
>
> then do the split,then remove the #'s. What a treat. Thanks Larry!


Use grep() to exclude any empty fields:

grep length, split /,/,',,,cat,,mouse,eel,fish';
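
As a rough sketch of what that filter does (variable names are only for this example): grep with length discards every empty field, interior ones included, so it fits when empty fields carry no meaning but not when field positions have to be preserved.

```perl
use strict;
use warnings;

my $line = ',,,cat,,mouse,eel,fish';

# Plain split: empty leading and interior fields are kept.
my @fields = split /,/, $line;

# grep { length } keeps only fields whose length is non-zero.
my @nonempty = grep { length } split /,/, $line;

print scalar(@fields), "\n";        # 8
print join('|', @nonempty), "\n";   # cat|mouse|eel|fish
```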

> To the programmer, it would be easier to make split consistent,


Are you suggesting a change of this well documented behaviour? Seems
not advisable to me.

--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl
 
 
Jeff 'japhy' Pinyan
 
      08-09-2004
On 9 Aug 2004, Sara wrote:

>But I can't help but wonder- what programming advantage does this
>offer and why was split designed to ignore some split candidates such
>as these trailing items? And why only omit trailing items, and not
>leading? Was there some presumption made about leading ones being more
>meaningful than trailing? Very odd presumption if so!

[snip]
>Perl is pretty much self-consistent, in fact this is one of very few
>cases I've encountered which lacks consistency. I'd be interested
>though in hearing the arguments on why this was a beneficial language
>design choice?


You seem to be very argumentative, or at least passionate, about this
issue, but as has been explained to you, you did not research the problem
at all. The answer was simply in the documentation of the function you
are using. Please try not to be so inflammatory in the future.

--
Jeff "japhy" Pinyan % How can we ever be the sold short or
RPI Acacia Brother #734 % the cheated, we who for every service
RPI Corporation Secretary % have long ago been overpaid?
http://japhy.perlmonk.org/ %
http://www.perlmonks.org/ % -- Meister Eckhart


 
 
Tad McClellan
 
      08-09-2004
Sara <(E-Mail Removed)> wrote:
> split /,/,'cat,mouse,,eel,fish';



> Huh? Where did the trailing items go?



Right where the contract (documentation) says they will go.

Why are you acting so surprised?


> I work around this inconsistency by adding in "placeholders" like:
>
> s/,,/,#,/g; s/,,/,#,/g;
>
> then do the split,then remove the #'s. What a treat. Thanks Larry!



If you sign a contract without reading it, you deserve any pain
that you get.

The corollary then is to read it before you sign it
(i.e. read the docs for a function before you use the function).


--
Tad McClellan SGML consulting
(E-Mail Removed) Perl programming
Fort Worth, Texas
 
 
Sara
 
      08-10-2004
"John W. Krahn" <(E-Mail Removed)> wrote in message news:<2kLRc.74830$T_6.55276@edtnps89>...
> Sara wrote:
> > split /,/,'cat,mouse,,eel,fish';
> >
> > yields
> > 0 'cat'
> > 1 'mouse'
> > 2 ''
> > 3 'eel'
> > 4 'fish'
> >
> > Fine. Likewise:
> >
> > split /,/,',,,cat,,mouse,eel,fish';
> >
> > yields
> > 0 ''
> > 1 ''
> > 2 ''
> > 3 'cat'
> > 4 ''
> > 5 'mouse'
> > 6 'eel'
> > 7 'fish'
> >
> > Everybody is happy. But
> >
> > split /,/,'cat,mouse,eel,fish,,,';
> >
> > yields
> > 0 'cat'
> > 1 'mouse'
> > 2 'eel'
> > 3 'fish'
> >
> > Huh? Where did the trailing items go?

>
> If you had read the documentation for split you would see that that
> is the defined behavior for zero, one or two argument versions of split.
>
> perldoc -f split
> split /PATTERN/,EXPR,LIMIT
> split /PATTERN/,EXPR
> split /PATTERN/
> split Splits a string into a list of strings and returns that
> list. By default, empty leading fields are preserved,
> and empty trailing ones are deleted.
>
>
> You will also see that there is a third argument "LIMIT" which will solve
> your problem.
>
>
>
> John


Another incorrect presumption. I did read the documentation. I'm
talking about the choice of the default operation of split. I'm not
suggesting that it didn't work with additional arguments (which I
predict few if any of my programmers would recognize). I'm suggesting
it was a poor language design choice, and asking what advantage(s)
that choice has for the programmer. An entirely different question.

Sometimes the discussions here are not unlike the fundies' argument
against evolution. You miss the whole point, because you believe Perl
is a divine doctrine that can't be questioned. It shouldn't upset you
because someone questions a language design choice.
 
 
Sara
 
      08-10-2004
Tad McClellan <(E-Mail Removed)> wrote in message news:<(E-Mail Removed)>...
> Sara <(E-Mail Removed)> wrote:
> > split /,/,'cat,mouse,,eel,fish';

>
>
> > Huh? Where did the trailing items go?

>
>
> Right where the contract (documentation) says they will go.
>
> Why are you acting so surprised?
>
>
> > I work around this inconsistency by adding in "placeholders" like:
> >
> > s/,,/,#,/g; s/,,/,#,/g;
> >
> > then do the split,then remove the #'s. What a treat. Thanks Larry!

>
>
> If you sign a contract without reading it, you deserve any pain
> that you get.
>
> The corollary then is to read it before you sign it
> (ie. read the docs for a function before you use the function)


Contract huh? With my legal-eagle hat on, a contract requires a
"meeting of the minds". I never agreed to this design choice, so there
is no contract, at least not with me.

I have yet to hear any valid reasons why this was a good design
choice. Just a lot of the usual robotic "read the docs" jabber. Tell
ya what-

"read the post"
 
 
Uri Guttman
 
      08-10-2004
>>>>> "S" == Sara <(E-Mail Removed)> writes:

S> I have yet to hear any valid reasons why this was a good design
S> choice. Just a lot of the usual robotic "read the docs" jabber. Tell
S> ya what-

learn awk. split's default behavior follows awk's.

and awk was a major influence on perl's design (hashes, -p loop, scalar
range, etc.)

uri

--
Uri Guttman ------ (E-Mail Removed) -------- http://www.stemsystems.com
--Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
Search or Offer Perl Jobs ---------------------------- http://jobs.perl.org
 
 
Tad McClellan
 
      08-10-2004
Sara <(E-Mail Removed)> wrote:
> Tad McClellan <(E-Mail Removed)> wrote in message news:<(E-Mail Removed)>...
>> Sara <(E-Mail Removed)> wrote:
>> > split /,/,'cat,mouse,,eel,fish';

>>
>>
>> > Huh? Where did the trailing items go?

>>
>>
>> Right where the contract (documentation) says they will go.



> Contract huh? With my legal-eagle hat on, a contract requires a
> "meeting of the minds".



If you agree to call the function, then there _has_ been a
meeting of the minds.

If you didn't want done what the function does, then you
wouldn't call that function.


> I never agreed to this design choice,



When you call the function, you are agreeing that it will do what
its docs say it will do.


> so there
> is no contract, at least not with me.
>
> I have yet to hear any valid reasons why this was a good design
> choice.



Pass a -1 as a 3rd argument, and that design choice will become moot.
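
A small sketch of how the three kinds of LIMIT behave on the string from the original post, going by the rules in perldoc -f split (zero acts like an omitted LIMIT, a positive value caps the number of fields, a negative value keeps everything). The variable names are invented for this example.

```perl
use strict;
use warnings;

my $line = 'cat,mouse,eel,fish,,,';

# LIMIT 0 behaves like an omitted LIMIT: empty trailing fields are stripped.
my @zero = split /,/, $line, 0;

# Positive LIMIT: at most that many fields; the remainder stays unsplit.
my @two = split /,/, $line, 2;

# Negative LIMIT: all fields, empty trailing ones included.
my @neg = split /,/, $line, -1;

print scalar(@zero), "\n";   # 4
print "$two[1]\n";           # mouse,eel,fish,,,
print scalar(@neg), "\n";    # 7
```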


> Just a lot of the usual robotic "read the docs" jabber.



The docs tell you how to disable the design choice that
you object to. Simply disable it and move on, no biggie.


> Tell
> ya what-
>
> "read the post"



Tell you what, meet the killfile.


--
Tad McClellan SGML consulting
(E-Mail Removed) Perl programming
Fort Worth, Texas
 