Yohan N. Leder wrote:
> All my tests are done using ActivePerl 5.8.8.817 under Win2K FR and
> Apache2.
>
> I'm trying to obtain (and display) user data which come from a web form
> with enctype as 'application/x-www-form-urlencoded' and don't succeed. I
> can do-it if the form is a 'multipart/form-data' but not a
> 'application/x-www-form-urlencoded'.
[snip code ]
> For example, if I submit the 'urlencoded' form (the first one, at top of
> generated web page, if you run the script without any url parameter)
> with the letter 'é' (accentuated e) inside the textarea, I get 'msg=%C3%
> A9' displayed in the browser (knowing this has been proceeded through
> the see() sub).
>
> While, if I submit the same 'é' from the 'multipart/form-data' form (the
> second one, at bottom of generated web page), I get a well interpreted
> UTF-8 'é' as expected.
>
> How to get this same UTF-8 'é' when form uses 'application/x-www-form-
> urlencoded' enctype ? How to modify the see() sub for this urlencoded
> form case ?
That shouldn't be particularly mysterious. You're specifying the page's
charset as UTF-8 in its header (where you say "Content-type: text/html;
charset=UTF-8"), so the browser submits the form data in UTF-8 as well.
The character 'é' (dec 233/hex E9/eacute/LATIN SMALL LETTER E WITH ACUTE)
is then sent as the two UTF-8 bytes C3 A9, which look like 'Ã©' when
read as ISO-8859-1: C3 is the code point for Ã, and A9 the one for ©.
Percent-encoded, that gives the expected value %C3%A9.
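To see those bytes for yourself, here is a minimal sketch using the core Encode module. The helper percent_encode_bytes is a made-up name for illustration, and it escapes every byte, which is cruder than what a browser does (a browser leaves plain letters and digits alone):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper (not from the original script):
# percent-encode every byte of a byte string.
sub percent_encode_bytes {
    my ($bytes) = @_;
    return join '', map { sprintf '%%%02X', ord($_) } split //, $bytes;
}

my $e_acute = "\x{E9}";   # U+00E9, LATIN SMALL LETTER E WITH ACUTE

my $utf8   = encode('UTF-8',      $e_acute);   # two bytes: C3 A9
my $latin1 = encode('ISO-8859-1', $e_acute);   # one byte:  E9

print percent_encode_bytes($utf8),   "\n";     # %C3%A9
print percent_encode_bytes($latin1), "\n";     # %E9
```

Same character both times, only the byte encoding chosen before the percent-escaping differs.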
Encoding é -> Ã© -> %C3%A9:
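#!/usr/bin/perl -w
# Read the raw urlencoded POST body (a single line for this small form).
my $posteddata = <STDIN>;
$posteddata = '' unless defined $posteddata;   # nothing posted yet
print <<PAGE;
Content-type: text/html; charset=UTF-8

<html><body>
Posted data: $posteddata<hr>
<form action='f.pl' method='post'>
<textarea name='msg'></textarea>
<input type='submit'>
</form></body></html>
PAGE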
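Whereas without a charset in the header, the browser typically falls
back to ISO-8859-1 (or Windows-1252), and the "normal" form encoding
would be é -> %E9: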
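#!/usr/bin/perl -w
# Same script, but the header no longer declares a charset.
my $posteddata = <STDIN>;
$posteddata = '' unless defined $posteddata;   # nothing posted yet
print <<PAGE;
Content-type: text/html

<html><body>
Posted data: $posteddata<hr>
<form action='f.pl' method='post'>
<textarea name='msg'></textarea>
<input type='submit'>
</form></body></html>
PAGE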
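Going the other way, i.e. turning the posted value back into a real 'é', could look roughly like the sketch below. Your see() sub got snipped, so this is not a drop-in replacement for it; decode_form_value is a hypothetical helper, and the charset you pass in is assumed to match the one the page was served with:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper (not the original see() sub):
# turn one urlencoded value into a Perl character string.
sub decode_form_value {
    my ($raw, $charset) = @_;
    $raw =~ tr/+/ /;                               # '+' stands for a space
    $raw =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;   # %XX -> raw byte
    return decode($charset, $raw);                 # bytes -> characters
}

my $from_utf8_page   = decode_form_value('%C3%A9', 'UTF-8');
my $from_latin1_page = decode_form_value('%E9',    'ISO-8859-1');

print "same character\n" if $from_utf8_page eq $from_latin1_page;
printf "U+%04X\n", ord($from_utf8_page);           # U+00E9
```

The percent-unescaping only gives you bytes; it is the decode() step that needs to know which charset the browser used. When you print the result on a UTF-8 page, it has to be encoded to UTF-8 again on output.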
P.S. 'application/x-www-form-urlencoded' is the default form encoding
type anyhow, so there is actually no need to set it as an attribute on
the form.
Recommended literature:
http://home.tiscali.nl/t876506/utf8tbl.html (search for string C3A9 on
that page)
Table of code points below 256:
http://en.wikipedia.org/wiki/ISO_8859-1
And of course Perl FAQ/docs, as Gunnar pointed out.
--
Bart