Unicode: Strings marked 'utf8'. Can they be converted to 'byte' without going the vec() route?

 
 
sln@netherlands.com
      08-03-2009
Below is my sample code. This works, but if I could get
a byte string from a *possibly* utf8 string with anything simpler
than this, I would be a happy camper.

In the real app, I have no control over how the sample is generated.
It's likely read from PerlIO with whatever encoding layers are applied.
I don't want to have to worry about that, just get it back to a byte
string for analysis.

Thanks a lot.
-sln

--------------------------

use strict;
use warnings;

my $sample = "unicode->\x{feff}\x{21000}\x{21000}";

print "\nUTF string, length = ".length($sample).", '$sample' :\n ";
for (map {ord $_} split //, $sample) {
    printf ("%x ",$_);
}
print "\n";

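my ($bytes, $offset) = ('',0);
for (map {ord $_} split //, $sample)
{
    # Split the code point into bytes, least significant first.
    my @ar = ();
    while ($_ > 0) {
        push @ar, $_ & 0xff;
        $_ >>= 8;
    }
    # Append the bytes most significant first, 8 bits per element.
    for (reverse @ar) {
        vec ($bytes, $offset++, 8) = $_;
    }
}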

print "\nByte converted, length = ".length($bytes).", '$bytes' :\n ";
for (map {ord $_} split //, $bytes) {
    printf ("%02x ",$_);
}
print "\n";

__END__

Wide character in print at btest.pl line 6.

UTF string, length = 12, 'unicode->n++==' :
75 6e 69 63 6f 64 65 2d 3e feff 21000 21000

Byte converted, length = 17, 'unicode->*?? ?? ' :
75 6e 69 63 6f 64 65 2d 3e fe ff 02 10 00 02 10 00
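
For comparison, here is a minimal sketch of two shorter routes. It is not a definitive answer, and both helper names (to_utf8_bytes, codepoint_bytes) are made up for illustration. The first assumes UTF-8 octets are acceptable for the analysis and uses the core utf8::encode function on a copy of the string. The second tries to reproduce the vec() output above (each code point written as big-endian bytes) with sprintf and pack. One difference from the loop above: a NUL character becomes a single 00 byte here, where the loop emits nothing for it.

```perl
use strict;
use warnings;

# Hypothetical helper: return the UTF-8 octets of a string.
# utf8::encode works in place, so it is applied to a copy.
sub to_utf8_bytes {
    my ($str) = @_;
    utf8::encode($str);    # characters -> UTF-8 octets, UTF8 flag off
    return $str;
}

# Hypothetical helper: each code point as big-endian bytes,
# mirroring what the vec() loop produces.
sub codepoint_bytes {
    my ($str) = @_;
    return join '', map {
        my $hex = sprintf '%x', ord;
        $hex = "0$hex" if length($hex) % 2;    # pad to whole bytes
        pack 'H*', $hex;
    } split //, $str;
}

my $sample = "unicode->\x{feff}\x{21000}\x{21000}";

# Print both results as hex so no wide characters reach STDOUT.
printf "utf8 octets     : %s\n", unpack('H*', to_utf8_bytes($sample));
printf "codepoint bytes : %s\n", unpack('H*', codepoint_bytes($sample));
```

Note that the two results are different byte strings: the first is a standard encoding that Encode or utf8::decode can reverse, while the second has no character boundaries and cannot be decoded back unambiguously.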


 