howa wrote:
> Hello, consider my simple cgi program below:
>
> #=======
> #!/usr/bin/perl
> use strict;
>
> use CGI;
> my $q = new CGI;
> my $s = $q->param("s");
> print $q->header( -type => "text/html" );
>
> print utf8::valid ($s);
> #=======
>
> Then I call, e.g.
>
> http://www.example.com/cgi-bin/test.cgi?s=abc (print 1, ok)
> http://www.example.com/cgi-bin/test.cgi?s=中文 (also print 1, but my
> parameter s is BIG5 traditional Chinese encoding, not utf8!)
I'm not sure about the meaning of utf8::valid(), but the docs
recommend using utf8::is_utf8() instead.
Does the below code make sense to you?
$ cat test.pl
use Encode;
$big5_uriencoded = '%A4%A4';
( $big5_bytes = $big5_uriencoded ) =~ s/%(..)/chr(hex $1)/eg;
print '$big5_bytes ', utf8::is_utf8($big5_bytes) ? 'is' : 'is not',
" in UTF-8 internally.\n";
$string = decode('Big5', $big5_bytes);
print '$string ', utf8::is_utf8($string) ? 'is' : 'is not',
" in UTF-8 internally.\n\n";
$ perl test.pl
$big5_bytes is not in UTF-8 internally.
$string is in UTF-8 internally.
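I believe this shows that $big5_bytes is still a raw byte string, which
Perl cannot treat as text until it has been decoded from Big5, while
$string is a decoded character string that can be encoded to UTF-8,
e.g. with encode('UTF-8', $string). Note that utf8::is_utf8() only
reports Perl's internal flag; it does not validate the encoding of the
input.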
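
If the goal is to find out whether the submitted bytes are well-formed
UTF-8, one possible approach is a strict decode inside an eval. This is
only a sketch: looks_like_utf8() is a name I made up for illustration,
and the check is a heuristic, since plain ASCII passes and some Big5
byte pairs can happen to form valid UTF-8 as well.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper (not part of any module): returns 1 if $bytes is a
# well-formed UTF-8 byte sequence, 0 otherwise.
sub looks_like_utf8 {
    my ($bytes) = @_;
    my $copy = $bytes;    # decode() with a CHECK argument may modify its input
    # FB_CROAK makes decode() die on malformed input instead of substituting.
    return eval { decode('UTF-8', $copy, Encode::FB_CROAK); 1 } ? 1 : 0;
}

my $big5_bytes = "\xA4\xA4";    # same bytes as '%A4%A4' in test.pl
my $utf8_bytes = encode('UTF-8', decode('Big5', $big5_bytes));

print 'abc:         ', looks_like_utf8('abc'),       "\n";
print 'Big5 bytes:  ', looks_like_utf8($big5_bytes), "\n";
print 'UTF-8 bytes: ', looks_like_utf8($utf8_bytes), "\n";
```

In the CGI program, the value returned by $q->param("s") would be passed
to the helper in place of the hard-coded strings.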
--
Gunnar Hjalmarsson
Email:
http://www.gunnar.cc/cgi-bin/contact.pl