Velocity Reviews


Mitchua 07-15-2003 05:46 PM

How to get length of string? length() problems
 
Simplified a bit, I'm parsing HTML documents to get sentences e.g.
my $html = get($URL);
# remove all HTML TAGs...blah blah blah
@sentences = split(/\./, $html);
then I'm trying to determine the number of characters in the sentence.
However, although when I print the sentences they look fine, when I use
length($sentence[0]) I get values in the hundreds for small sentences. Most
documentation I found said "length() returns the number of chars" however,
some said "length() returns the number of bytes". To get the number of
chars in this case, can I just divide by 8 or something?

Thanks for your help.
Mitchua



Mitchua 07-15-2003 06:40 PM

Re: How to get length of string? length() problems
 
"Mitchua" <mitchua@yahoo.com> wrote in message
news:V5XQa.53675$1aB1.35315@news02.bloor.is.net.cable.rogers.com...
> Simplified a bit, I'm parsing HTML documents to get sentences e.g.
> my $html = get($URL);
> # remove all HTML TAGs...blah blah blah
> @sentences = split(/\./, $html);
> then I'm trying to determine the number of characters in the sentence.
> However, although when I print the sentences they look fine, when I use
> length($sentence[0]) I get values in the hundreds for small sentences. Most
> documentation I found said "length() returns the number of chars" however,
> some said "length() returns the number of bytes". To get the number of
> chars in this case, can I just divide by 8 or something?
>


Would something like sprintf("%20s", $sentence[0]) work to crop the sentence
to 20 characters?

--Mitchua
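
On the sprintf question: as far as I know, the number in "%20s" is a minimum field width, so it pads short strings and leaves long ones alone. A precision ("%.20s") or substr is what crops. A minimal sketch, where the sample sentence is made up for illustration and not taken from the poster's data:

```perl
use strict;
use warnings;

# Made-up sample text, longer than 20 characters.
my $sentence = "The quick brown fox jumps over the lazy dog";

# "%20s" sets a minimum width only; longer strings pass through unchanged.
my $padded = sprintf("%20s", $sentence);

# "%.20s" sets a precision, which caps the string at 20 characters.
my $cropped = sprintf("%.20s", $sentence);

# substr does the same crop without going through a format string.
my $cropped_substr = substr($sentence, 0, 20);

print length($padded), "\n";
print length($cropped), "\n";
print "[$cropped_substr]\n";
```

Either cropping form should do here; substr is probably the more common choice when no other formatting is needed.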



Rich 07-15-2003 07:56 PM

Re: How to get length of string? length() problems
 
Mitchua wrote:

> "Mitchua" <mitchua@yahoo.com> wrote in message
> news:V5XQa.53675$1aB1.35315@news02.bloor.is.net.cable.rogers.com...
>> Simplified a bit, I'm parsing HTML documents to get sentences e.g.
>> my $html = get($URL);
>> # remove all HTML TAGs...blah blah blah
>> @sentences = split(/\./, $html);
>> then I'm trying to determine the number of characters in the sentence.
>> However, although when I print the sentences they look fine, when I use
>> length($sentence[0]) I get values in the hundreds for small sentences. Most
>> documentation I found said "length() returns the number of chars" however,
>> some said "length() returns the number of bytes". To get the number of
>> chars in this case, can I just divide by 8 or something?
>>
>
> Would something like sprintf("%20s", $sentence[0]) work to crop the
> sentence to 20 characters?
>
> --Mitchua


perldoc -f length:

"length EXPR
length Returns the length in characters of the value of EXPR..."


BUT length() returns the length in bytes when the bytes pragma is used, e.g.:

$x = chr(400);
print "Length is ", length $x, "\n"; # "Length is 1"
printf "Contents are %vd\n", $x; # "Contents are 400"
{
    use bytes;
    print "Length is ", length $x, "\n"; # "Length is 2"
    printf "Contents are %vd\n", $x; # "Contents are 198.144"
}

perldoc bytes for more info.
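
A related case, sketched under the assumption that the fetched page arrives as undecoded UTF-8 bytes (nothing in the thread confirms this for the poster's get() call): length then counts bytes until the string is decoded. The Encode module's decode and encode are real functions; the sample word is made up.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Stand-in for fetched content: the UTF-8 encoding of "cafe" with an
# accented e (U+00E9), held as raw bytes.
my $raw = encode('UTF-8', "caf\x{e9}");

# Decoding turns the byte string into a character string.
my $text = decode('UTF-8', $raw);

my $byte_count = length $raw;   # counts bytes: the accented e takes two
my $char_count = length $text;  # counts characters

print "bytes: $byte_count, chars: $char_count\n";
```

Note that this kind of mismatch only inflates the count by a small factor per non-ASCII character, so on its own it would not turn a short sentence into a length in the hundreds.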

Cheers,
--
Rich
scriptyrich@yahoo.co.uk

Eric J. Roode 07-16-2003 03:08 AM

Re: How to get length of string? length() problems
 
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

"Mitchua" <mitchua@yahoo.com> wrote in
news:V5XQa.53675$1aB1.35315@news02.bloor.is.net.cable.rogers.com:

> Simplified a bit, I'm parsing HTML documents to get sentences e.g.
> my $html = get($URL);
> # remove all HTML TAGs...blah blah blah
> @sentences = split(/\./, $html);
> then I'm trying to determine the number of characters in the sentence.
> However, although when I print the sentences they look fine, when I
> use length($sentence[0]) I get values in the hundreds for small
> sentences. Most documentation I found said "length() returns the
> number of chars" however, some said "length() returns the number of
> bytes". To get the number of chars in this case, can I just divide by
> 8 or something?


Only if your characters are 8 bytes wide!

Do you have an example of input data that exhibits this length()
discrepancy? Can you include the output of something like:

print "[[[$string]]] ", length($string), "\n";
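
To show what that kind of bracketed print can reveal, here is a minimal sketch. The padded string is a hypothetical leftover from tag stripping, not the poster's actual data; the point is only that whitespace which is hard to notice in plain output still counts toward length.

```perl
use strict;
use warnings;

# Hypothetical leftover from tag stripping: short text, lots of padding.
my $sentence = "\n\n      Hello world   \n\t\t   ";

# Brackets make the leading and trailing whitespace visible.
print "[[[$sentence]]] ", length($sentence), "\n";

# Collapse whitespace runs to single spaces, then trim both ends.
(my $clean = $sentence) =~ s/\s+/ /g;
$clean =~ s/^ | $//g;

print "[[[$clean]]] ", length($clean), "\n";
```

If the brackets end up far apart from the visible text, trimming before measuring is likely the fix.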

- --
Eric
$_ = reverse sort qw p ekca lre Js reh ts
p, $/.r, map $_.$", qw e p h tona e; print

-----BEGIN PGP SIGNATURE-----
Version: PGPfreeware 7.0.3 for non-commercial use <http://www.pgp.com>

iQA/AwUBPxTBu2PeouIeTNHoEQIJcgCeNrC1lDNYKBtdGsL5Bw0bxdIM2BMAnRAr
vTZutckih5KT81pj/63k5mDZ
=1LLa
-----END PGP SIGNATURE-----

Mitchua 07-16-2003 11:09 PM

Re: How to get length of string? length() problems
 

"Eric J. Roode" <REMOVEsdnCAPS@comcast.net> wrote in message
news:Xns93B9EB73EF613sdn.comcast@206.127.4.25...
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> "Mitchua" <mitchua@yahoo.com> wrote in
> news:V5XQa.53675$1aB1.35315@news02.bloor.is.net.cable.rogers.com:
>
> > Simplified a bit, I'm parsing HTML documents to get sentences e.g.
> > my $html = get($URL);
> > # remove all HTML TAGs...blah blah blah
> > @sentences = split(/\./, $html);
> > then I'm trying to determine the number of characters in the sentence.
> > However, although when I print the sentences they look fine, when I
> > use length($sentence[0]) I get values in the hundreds for small
> > sentences. Most documentation I found said "length() returns the
> > number of chars" however, some said "length() returns the number of
> > bytes". To get the number of chars in this case, can I just divide by
> > 8 or something?
>
> Only if your characters are 8 bytes wide!
>
> Do you have an example of input data that exhibits this length()
> discrepancy?


Check out Rich's reply. My problem was that I was using length($sentence)
instead of length $sentence. Once I changed that, it was all good. Thanks
for the reply.

Mitchua



Eric J. Roode 07-17-2003 12:08 AM

Re: How to get length of string? length() problems
 
"Mitchua" <mitchua@yahoo.com> wrote in
news:7XkRa.89580$sI91.77734@news04.bloor.is.net.cable.rogers.com:

> Check out Rich's reply. My problem was that I was using
> length($sentence) instead of length $sentence. Once I changed that,
> it was all good. Thanks for the reply.


Hmmm. I fail to see how that could possibly make a difference. But hey,
whatever works is good.
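
A quick check of that, as a minimal sketch with a made-up sample string: with a lone scalar argument the two spellings should give the same number. The parentheses only change things when more of the expression follows, because length without parentheses is a named unary operator and takes everything of higher precedence to its right as its argument.

```perl
use strict;
use warnings;

# Made-up sample string, 14 characters long.
my $sentence = "Short sentence";

# Both spellings call the same built-in on the same argument.
my $with_parens    = length($sentence);
my $without_parens = length $sentence;

print "$with_parens $without_parens\n";

# With a trailing operator the two forms parse differently:
my $joined_first = length $sentence . "!!";   # length of "Short sentence!!"
my $length_first = length($sentence) . "!!";  # the length, then "!!" appended

print "$joined_first $length_first\n";
```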

--
Eric
$_ = reverse sort qw p ekca lre Js reh ts
p, $/.r, map $_.$", qw e p h tona e; print

