PERL - How to get length of string? length() problems

 
07-15-2003, 06:46 PM   #1
How to get length of string? length() problems


Simplified a bit, I'm parsing HTML documents to get sentences, e.g.
my $html = get($URL);
# remove all HTML TAGs...blah blah blah
@sentences = split(/\./, $html);
Then I'm trying to determine the number of characters in each sentence.
However, although the sentences look fine when I print them, when I use
length($sentence[0]) I get values in the hundreds for small sentences. Most
documentation I found said "length() returns the number of chars"; however,
some said "length() returns the number of bytes". To get the number of
chars in this case, can I just divide by 8 or something?

Thanks for your help.
Mitchua




07-15-2003, 07:40 PM   #2
Mitchua
Re: How to get length of string? length() problems

"Mitchua" <> wrote in message
news:V5XQa.53675$ ble.rogers.com...
> Simplified a bit, I'm parsing HTML documents to get sentences e.g.
> my $html = get($URL);
> # remove all HTML TAGs...blah blah blah
> @sentences = split(/\./, $html);
> then I'm trying to determine the number of characters in the sentence.
> However, although when I print the sentences they look fine, when I use
> length($sentence[0]) I get values in the hundreds for small sentences. Most
> documentation I found said "length() returns the number of chars" however,
> some said "length() returns the number of bytes". To get the number of
> chars in this case, can I just divide by 8 or something?


Would something like sprintf("%20s", $sentence[0]) work to crop the sentence
to 20 characters?
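
A minimal sketch of the difference, using a made-up sentence: in sprintf, "%20s" sets a minimum field width (it pads short strings and leaves long ones alone), while a precision such as "%.20s" sets a maximum. substr is the more direct way to crop. The variable names here are illustrative only.

```perl
use strict;
use warnings;

# Made-up sample text; any string longer than 20 characters will do.
my $sentence = "The quick brown fox jumps over the lazy dog";

# "%20s" is a MINIMUM width: shorter strings are left-padded with
# spaces, longer strings pass through untruncated.
my $padded = sprintf("%20s", $sentence);

# "%.20s" is a MAXIMUM: at most 20 characters are kept.
my $cropped = sprintf("%.20s", $sentence);

# substr performs the same crop without a format string.
my $also_cropped = substr($sentence, 0, 20);

print "padded:  [$padded]\n";
print "cropped: [$cropped]\n";
print "substr:  [$also_cropped]\n";
```

So "%20s" on its own would not shorten a long sentence; it only guarantees the result is at least 20 characters wide.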

--Mitchua


07-15-2003, 08:56 PM   #3
Rich
Re: How to get length of string? length() problems

Mitchua wrote:

> "Mitchua" <> wrote in message
> news:V5XQa.53675$ ble.rogers.com...
>> Simplified a bit, I'm parsing HTML documents to get sentences e.g.
>> my $html = get($URL);
>> # remove all HTML TAGs...blah blah blah
>> @sentences = split(/\./, $html);
>> then I'm trying to determine the number of characters in the sentence.
>> However, although when I print the sentences they look fine, when I use
>> length($sentence[0]) I get values in the hundreds for small sentences.
>> Most documentation I found said "length() returns the number of chars"
>> however, some said "length() returns the number of bytes". To get the
>> number of chars in this case, can I just divide by 8 or something?
>
> Would something like sprintf("%20s", $sentence[0]) work to crop the
> sentence to 20 characters?
>
> --Mitchua


perldoc -f length:

"length EXPR
length Returns the length in characters of the value of EXPR..."


BUT length() returns the length in bytes when the bytes pragma is used, e.g.:

$x = chr(400);
print "Length is ", length $x, "\n";     # "Length is 1"
printf "Contents are %vd\n", $x;         # "Contents are 400"
{
    use bytes;
    print "Length is ", length $x, "\n"; # "Length is 2"
    printf "Contents are %vd\n", $x;     # "Contents are 198.144"
}

perldoc bytes for more info.
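
A related sketch, assuming the fetched page is UTF-8 (the thread does not say what encoding it was): until the raw bytes are decoded, length() counts each byte as one character. The sample string is made up. This effect could inflate a count by a small factor at most, so it may not explain values in the hundreds on its own.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical input: the UTF-8 bytes for "café", as they might
# arrive from a network fetch before any decoding.
my $raw = "caf\xC3\xA9";

# Undecoded, Perl sees five one-byte characters.
my $raw_length = length($raw);

# After decoding, the two bytes C3 A9 become the single character é.
my $text        = decode('UTF-8', $raw);
my $char_length = length($text);

# Encoding back to UTF-8 gives the byte count without "use bytes".
my $byte_length = length(encode('UTF-8', $text));

print "raw: $raw_length, chars: $char_length, bytes: $byte_length\n";
# prints "raw: 5, chars: 4, bytes: 5"
```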

Cheers,
--
Rich

07-16-2003, 04:08 AM   #4
Eric J. Roode
Re: How to get length of string? length() problems


"Mitchua" <> wrote in
news:V5XQa.53675$ ble.rogers.com:

> Simplified a bit, I'm parsing HTML documents to get sentences e.g.
> my $html = get($URL);
> # remove all HTML TAGs...blah blah blah
> @sentences = split(/\./, $html);
> then I'm trying to determine the number of characters in the sentence.
> However, although when I print the sentences they look fine, when I
> use length($sentence[0]) I get values in the hundreds for small
> sentences. Most documentation I found said "length() returns the
> number of chars" however, some said "length() returns the number of
> bytes". To get the number of chars in this case, can I just divide by
> 8 or something?


Only if your characters are 8 bytes wide!

Do you have an example of input data that exhibits this length()
discrepancy? Can you include the output of something like:

print "[[[$string]]] ", length($string), "\n";
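
A slightly extended sketch of that diagnostic, in case the extra length comes from characters that do not show up when printed (leftover whitespace, line breaks, control characters). That cause is only a guess; describe() is a hypothetical helper and the sample string is made up.

```perl
use strict;
use warnings;

# Hypothetical helper: bracket the string, render anything outside
# printable ASCII as \x{HEX}, and append the length of the original.
sub describe {
    my ($string) = @_;
    (my $visible = $string) =~ s/([^\x20-\x7E])/sprintf("\\x{%X}", ord($1))/ge;
    return sprintf("[[[%s]]] %d", $visible, length($string));
}

# Made-up example: a short sentence surrounded by stray whitespace.
my $sample = "\n\t  Hello world \r\n";
print describe($sample), "\n";
```

Plain spaces are left as they are; the brackets make leading and trailing ones visible.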

--
Eric
$_ = reverse sort qw p ekca lre Js reh ts
p, $/.r, map $_.$", qw e p h tona e; print

07-17-2003, 12:09 AM   #5
Mitchua
Re: How to get length of string? length() problems


"Eric J. Roode" <> wrote in message
news:Xns93B9EB73EF613sdn.comcast@206.127.4.25...
> "Mitchua" <> wrote in
> news:V5XQa.53675$ ble.rogers.com:
>
> > Simplified a bit, I'm parsing HTML documents to get sentences e.g.
> > my $html = get($URL);
> > # remove all HTML TAGs...blah blah blah
> > @sentences = split(/\./, $html);
> > then I'm trying to determine the number of characters in the sentence.
> > However, although when I print the sentences they look fine, when I
> > use length($sentence[0]) I get values in the hundreds for small
> > sentences. Most documentation I found said "length() returns the
> > number of chars" however, some said "length() returns the number of
> > bytes". To get the number of chars in this case, can I just divide by
> > 8 or something?

>
> Only if your characters are 8 bytes wide!
>
> Do you have an example of input data that exhibits this length()
> discrepancy?


Check out Rich's reply. My problem was that I was using length($sentence)
instead of length $sentence. Once I changed that, it was all good. Thanks
for the reply.

Mitchua


07-17-2003, 01:08 AM   #6
Eric J. Roode
Re: How to get length of string? length() problems

"Mitchua" <> wrote in
news:7XkRa.89580$ ble.rogers.com:

> Check out Rich's reply. My problem was that I was using
> length($sentence) instead of length $sentence. Once I changed that,
> it was all good. Thanks for the reply.


Hmmm. I fail to see how that could possibly make a difference. But hey,
whatever works is good.
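
For a single scalar the two spellings are equivalent, as this sketch with a made-up string shows. The one place parentheses change the result is when more follows the argument, because length is a named unary operator and binds less tightly than ".". Whether anything like that was in the original code is unknown.

```perl
use strict;
use warnings;

my $sentence = "A short sentence";

# With a lone scalar argument, both forms return the same value.
my $with_parens    = length($sentence);
my $without_parens = length $sentence;

# With more on the line, the forms can diverge:
# this parses as length($sentence . "!!") ...
my $concat_first = length $sentence . "!!";
# ... while this is the length followed by the literal "!!".
my $length_first = length($sentence) . "!!";

print "$with_parens $without_parens $concat_first $length_first\n";
# prints "16 16 18 16!!"
```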

--
Eric
$_ = reverse sort qw p ekca lre Js reh ts
p, $/.r, map $_.$", qw e p h tona e; print