Velocity Reviews - Computer Hardware Reviews


UTF8>UNICODE

 
 
Meelis Lilbok
 
      04-25-2006
Hi

My ASP pages use UTF-8 encoding.

How do I convert UTF-8 text from Request.Form("text") to Unicode for
searching an MSSQL database?

Best regards,
Meelis



 
Anthony Jones
 
      04-25-2006

"Meelis Lilbok" wrote:
> My ASP pages use UTF-8 encoding.
>
> How do I convert UTF-8 text from Request.Form("text") to Unicode for
> searching an MSSQL database?


x = Request.Form("text")

x now contains a Unicode string.

When passing it to an ADO Command object parameter, make sure the parameter
type is adVarWChar.

Anthony.


 
Meelis Lilbok
 
      04-25-2006
> x = Request.Form("text")

No, x is in UTF-8 format! That's the problem.

I use an ActiveX DLL and API calls to convert UTF-8 to Unicode, but this will
not work where the use of ActiveX is disabled.

Meelis


 
Anthony Jones
 
      04-25-2006

"Meelis Lilbok" wrote:
> > x = Request.Form("text")
>
> No, x is in UTF-8 format! That's the problem.
>
> I use an ActiveX DLL and API calls to convert UTF-8 to Unicode, but this will
> not work where the use of ActiveX is disabled.


VBScript supports only one string format and that is Unicode.

I suspect that the form submission is using UTF-8 but the server-side script
doesn't know that and is treating it as ISO-8859-1 or the like. Hence you
are getting a Unicode string in which each byte of the UTF-8 encoding has
become a separate character.
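
A minimal sketch of that effect, written in Python purely for illustration (the thread itself concerns classic ASP). It assumes the receiving side decodes the request bytes as ISO-8859-1; the sample word and the variable names are only examples.

```python
# Simulate a browser sending UTF-8 bytes and a server decoding them
# with the wrong charset (assumed here to be ISO-8859-1).

original = "v\u00e4ike"                      # "väike", 5 characters

# The browser encodes the form value as UTF-8: "ä" becomes two bytes.
sent_bytes = original.encode("utf-8")        # b'v\xc3\xa4ike'

# The server treats every byte as one ISO-8859-1 character.
misread = sent_bytes.decode("iso-8859-1")

print(misread)                               # vÃ¤ike
print(len(original), len(misread))           # 5 6
```

The result is a perfectly valid Unicode string, just not the one the user typed: the two UTF-8 bytes of "ä" show up as the two characters "Ã" and "¤".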

What is the character encoding of the page that contains the text control?

Does the page actually inform the client of the character encoding used for
the page?

What method is used to submit the form, GET or POST?

What is the Enctype of the form?

Is AcceptCharset specified for the Form?

What Browser are you using?

Anthony.


 
Meelis Lilbok
 
      04-25-2006
> What is the character encoding of the page that contains the text control?

UTF-8

> Does the page actually inform the client of the character encoding used for
> the page?

Yes

> What method is used to submit the form, GET or POST?

POST

> What is the Enctype of the form?

None, because the page encoding is UTF-8

> Is AcceptCharset specified for the Form?

No

> What Browser are you using?

IE6

Meelis


 
Meelis Lilbok
 
      04-25-2006
For example:

If I enter the Estonian word "väike" into the text box,
submit the form to another page, search.asp,
and read Request.Form("text"),
I get "vÃ¤ike" (UTF-8).



Meelis











 
Anthony Jones
 
      04-25-2006

"Meelis Lilbok" wrote:
> For example:
>
> If I enter the Estonian word "väike" into the text box,
> submit the form to another page, search.asp,
> and read Request.Form("text"),
> I get "vÃ¤ike" (UTF-8).


Having looked into it a bit more, it would seem that the forms approach just
isn't compatible with UTF-8 or Unicode. There doesn't seem to be a way to
inform the server of the actual charset used to encode the form values.

I'm actually quite amazed at this.

What do you actually need to do?

Do you need to support input characters beyond ISO-8859-1? If not, I would
suggest you ditch UTF-8 and use ISO-8859-1 everywhere instead.

Otherwise, it is possible to do the decoding in VBScript yourself, but it's
really messy. A small VB6 component would make this a lot easier.
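
To give a sense of what decoding by hand involves, here is a rough sketch in Python rather than VBScript (a substitution made for brevity; a VBScript version would follow the same steps with AscW, ChrW and integer arithmetic). The function name decode_misread_utf8 is hypothetical. It assumes every character of the input stands for exactly one byte, as happens when UTF-8 bytes are read as ISO-8859-1, and it skips checks for overlong sequences and surrogates.

```python
def decode_misread_utf8(text: str) -> str:
    """Treat each character of `text` as one byte and decode those bytes as UTF-8."""
    out = []
    i = 0
    while i < len(text):
        b = ord(text[i])
        if b > 0xFF:
            raise ValueError("input is not a one-byte-per-character string")

        # The lead byte tells us how many continuation bytes follow.
        if b < 0x80:
            code, extra = b, 0            # plain ASCII
        elif 0xC0 <= b < 0xE0:
            code, extra = b & 0x1F, 1     # 2-byte sequence
        elif 0xE0 <= b < 0xF0:
            code, extra = b & 0x0F, 2     # 3-byte sequence
        elif 0xF0 <= b < 0xF8:
            code, extra = b & 0x07, 3     # 4-byte sequence
        else:
            raise ValueError("invalid UTF-8 lead byte")

        # Each continuation byte contributes its low 6 bits.
        for k in range(1, extra + 1):
            if i + k >= len(text):
                raise ValueError("truncated UTF-8 sequence")
            c = ord(text[i + k])
            if c & 0xC0 != 0x80:
                raise ValueError("invalid UTF-8 continuation byte")
            code = (code << 6) | (c & 0x3F)

        out.append(chr(code))
        i += extra + 1
    return "".join(out)


print(decode_misread_utf8("v\u00c3\u00a4ike"))   # väike
```

If the server actually used a Windows code page rather than ISO-8859-1, some bytes in the 0x80 to 0x9F range would have been mapped to other characters first, and this sketch would need an extra reverse-mapping step.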

Another option may be to ditch forms and post XML instead. (This is what I
do; I don't use forms.)

Anthony.


 
Meelis Lilbok
 
      04-26-2006
Hi

I can't use ISO-8859-1, because I need to support Cyrillic characters too.
It's easier to use my ActiveX DLL with its conversion functions.
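
The ISO-8859-1 limitation is easy to check. The sketch below uses Python only as an illustration, and the sample word is an assumption chosen for the demo.

```python
# ISO-8859-1 covers Western European letters but has no Cyrillic,
# while UTF-8 can encode both.

word = "\u043f\u043e\u0438\u0441\u043a"   # "поиск", Russian for "search"

try:
    word.encode("iso-8859-1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False

utf8_bytes = word.encode("utf-8")

print(latin1_ok)        # False
print(len(utf8_bytes))  # 10 (two bytes per Cyrillic letter)
```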


Best regards,
Meelis







 
Egbert Nierop (MVP for IIS)
 
      04-26-2006

"Meelis Lilbok" wrote:
> My ASP pages use UTF-8 encoding.
>
> How do I convert UTF-8 text from Request.Form("text") to Unicode for
> searching an MSSQL database?


Use this as the first line of your ASP page:
<% codepage=65001%>

--
compatible web farm Session replacement for Asp and Asp.Net
http://www.nieropwebconsult.nl/asp_session_manager.htm

 
Anthony Jones
 
      04-26-2006

"Egbert Nierop (MVP for IIS)" wrote:
> Use this as the first line of your ASP page:
> <% codepage=65001%>


Did you mean:

<%@ codepage=65001 %>

I don't think that helps. The value of Session.CodePage doesn't seem to
affect the assumptions made by the server about the encoding of the request
data.






 