encoding problem

 
 
Jim Lawton
 
      01-11-2005
Hi,

Setup: .NET, C#, an HttpHandler, and a plain HTML form in the browser.

GBP pound sign problem (I know I know - I *can* decode it, but I've got to
understand what and why I should be doing stuff)

I am uploading text data from a form. This data is either typed directly into a
textarea, or is a file stream originating from a .txt file (or some other basic
text file, e.g. from a Mac or Unix machine; at present I can't be sure it's
only .txt).

The page encoding is :-
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

On arrival at the server the content encoding is, sure enough, UTF-8.

Data input via the textarea and read into a string is displayed in the debugger
as pound signs (£).

Data input as a file stream contains a single byte, 0xA3, for each GBP pound
sign.

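I process the input stream like this :-

// Requires: using System.IO; using System.Text;
public static string StreamToString(Stream aStream)
{
    // Rewind, then read the whole stream into memory.
    aStream.Position = 0;
    byte[] buffer = new byte[aStream.Length];

    // Read may return fewer bytes than requested, so loop until the buffer is full.
    int offset = 0;
    while (offset < buffer.Length)
    {
        int read = aStream.Read(buffer, offset, buffer.Length - offset);
        if (read == 0) throw new EndOfStreamException();
        offset += read;
    }
    return BytesToUTF8String(buffer);
}

public static string BytesToUTF8String(byte[] bytes)
{
    // Size the char array first, then decode the bytes as UTF-8.
    Encoding utf8 = Encoding.UTF8;
    char[] utf8Chars = new char[utf8.GetCharCount(bytes, 0, bytes.Length)];
    utf8.GetChars(bytes, 0, bytes.Length, utf8Chars, 0);

    return new string(utf8Chars);
}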

The resulting string contains nothing ...

If I use ASCII instead of UTF8, the text is sensible except that my GBP signs
come out as question marks (?).

If I use UTF7 I get an apparently OK decoding.

I am dubious about using UTF7 for no better reason than that it works. Is there
logic here? What should I be doing?

Thanks,
Jim
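
A minimal sketch, not part of the original post, that reproduces the symptoms described above for a single 0xA3 byte. The class name PoundSignDemo and the helper Decode are made up for illustration; the only library calls are Encoding.GetEncoding and GetString. Depending on the runtime version, a UTF-8 decoder may drop the invalid byte or substitute a replacement character, so the sketch only checks that the result is not a pound sign.

```csharp
using System;
using System.Text;

public static class PoundSignDemo
{
    // Hypothetical helper: decode raw bytes with the named encoding.
    public static string Decode(string encodingName, byte[] bytes)
    {
        return Encoding.GetEncoding(encodingName).GetString(bytes);
    }

    public static void Main()
    {
        byte[] singleByte = { 0xA3 };      // pound sign as found in the uploaded file
        byte[] utf8Pair = { 0xC2, 0xA3 };  // pound sign as UTF-8 encodes it

        // A lone 0xA3 is a stray continuation byte in UTF-8, so it is not U+00A3.
        Console.WriteLine(Decode("utf-8", singleByte) == "\u00A3");       // False
        // ASCII has no mapping for bytes above 0x7F and substitutes '?'.
        Console.WriteLine(Decode("us-ascii", singleByte));                // ?
        // ISO-8859-1 maps each byte to the code point with the same value.
        Console.WriteLine(Decode("iso-8859-1", singleByte) == "\u00A3");  // True
        // The two-byte sequence is what a UTF-8 decoder expects for U+00A3.
        Console.WriteLine(Decode("utf-8", utf8Pair) == "\u00A3");         // True
    }
}
```

This suggests the uploaded file is in a single-byte encoding such as ISO-8859-1 rather than UTF-8, whatever the charset of the page that hosts the form. UTF7 appearing to work is most likely a side effect of how that decoder treats bytes above 0x7F, not evidence that the file is UTF-7.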
 
 
 
 
 
bruce barker
 
      01-11-2005
It doesn't really matter what encoding you use for the page response; what's
important is the encoding used on the post from the browser. The browser
picks this (though often it will match the page). You should check the
Content-Type header the browser sends to determine the character set. For an
HTML form post (application/x-www-form-urlencoded), ISO-8859-1 is the default
character set, not UTF-8.

-- bruce (sqlwork.com)
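
One way to act on the advice above, as a rough sketch rather than a definitive implementation: read the charset parameter from the Content-Type header value and fall back to ISO-8859-1 when none is given. RequestCharset and FromContentType are hypothetical names; the fallback choice simply follows the default stated in this reply. Parsing uses System.Net.Mime.ContentType, and the header string is assumed to have been obtained from the request already.

```csharp
using System;
using System.Net.Mime;
using System.Text;

public static class RequestCharset
{
    // Hypothetical helper: pick a decoder from a Content-Type header value.
    public static Encoding FromContentType(string contentTypeHeader)
    {
        // Assumption: ISO-8859-1 is the fallback when no charset is declared.
        Encoding fallback = Encoding.GetEncoding("iso-8859-1");
        if (string.IsNullOrEmpty(contentTypeHeader)) return fallback;

        try
        {
            string charset = new ContentType(contentTypeHeader).CharSet;
            return string.IsNullOrEmpty(charset)
                ? fallback
                : Encoding.GetEncoding(charset);
        }
        catch (FormatException) { return fallback; }    // malformed header value
        catch (ArgumentException) { return fallback; }  // unknown charset name
    }

    public static void Main()
    {
        Console.WriteLine(FromContentType("text/plain; charset=utf-8").WebName);          // utf-8
        Console.WriteLine(FromContentType("application/x-www-form-urlencoded").WebName);  // iso-8859-1
    }
}
```

Note that for a file upload the browser sends the file's bytes exactly as they were saved, so a declared charset on the request describes the form fields rather than guaranteeing anything about the file's contents.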


 
 
 
 
 
Jim Lawton
 
      01-12-2005

Thanks Bruce,

For anyone googling this topic in future, there's more in
dotnet.languages.csharp, Message-ID: <(E-Mail Removed)>

cheers Jim


 