Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > Python > decoding a byte array that is unicode escaped?

Reply
Thread Tools

decoding a byte array that is unicode escaped?

 
 
sam
Guest
Posts: n/a
 
      11-06-2009
I have a byte stream read over the internet:

responseByteStream = urllib.request.urlopen( httpRequest );
responseByteArray = responseByteStream.read();

The characters are encoded with unicode escape sequences, for example
a copyright symbol appears in the stream as the bytes:

5C 75 30 30 61 39

which translates to:
\u00a9

which is unicode for the copyright symbol.

I am simply trying to display this copyright symbol on a webpage, so
how do I encode the byte array to utf-8 given that it is 'escape
encoded' in the above way? I tried:

responseByteArray.decode('utf-8')
and responseByteArray.decode('unicode_escape')
and str(responseByteArray).

I am using Python 3.1.

 
Reply With Quote
 
 
 
 
Peter Otten
Guest
Posts: n/a
 
      11-06-2009
sam wrote:

> I have a byte stream read over the internet:
>
> responseByteStream = urllib.request.urlopen( httpRequest );
> responseByteArray = responseByteStream.read();
>
> The characters are encoded with unicode escape sequences, for example
> a copyright symbol appears in the stream as the bytes:
>
> 5C 75 30 30 61 39
>
> which translates to:
> \u00a9
>
> which is unicode for the copyright symbol.
>
> I am simply trying to display this copyright symbol on a webpage, so
> how do I encode the byte array to utf-8 given that it is 'escape
> encoded' in the above way? I tried:
>
> responseByteArray.decode('utf-8')
> and responseByteArray.decode('unicode_escape')
> and str(responseByteArray).
>
> I am using Python 3.1.


Convert the bytes to unicode first:

>>> u = b"\\u00a9".decode("unicode-escape")
>>> u

'©'

Then convert the string to bytes:

>>> u.encode("utf-8")

b'\xc2\xa9'


 
Reply With Quote
 
 
 
 
strong.drug@gmail.com
Guest
Posts: n/a
 
      08-13-2012
пятница, 6 ноября 2009*г., 12:48:47 UTC+4 пользователь sam написал:

> I am simply trying to display this copyright symbol on a webpage, so
> how do I encode the byte array to utf-8 given that it is 'escape
> encoded' in the above way? I tried:
>
> responseByteArray.decode('utf-8')
> and responseByteArray.decode('unicode_escape')
> and str(responseByteArray).
>
> I am using Python 3.1.

I had some problem with reading zip archive in raw (binary) mode.
I solve it this way
.....
open (filename, 'rb').read ().encode('string_escape')
# now we had strings with strange symbols are escaped
# than we can handle it without decoding excepions for example:
body = '\r\n'.join (lines)
.....
# if we have unescaped strings we can get an exception there
# after opertions, we needed we must unescape all content
# and drop it out to network (in my case)
body = body.decode('string-escape')
.....
# then we can send so to the server
connection.request('POST', upload_url, body, headers)

BR)
 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
problem in running a basic code in python 3.3.0 that includes HTML file Satabdi Mukherjee Python 1 04-04-2013 07:48 PM
convert form byte[4] to Int32 while retaining the binary value of the byte array jeff@foundrymusic.com C++ 20 09-07-2009 08:54 PM
Re: how to extend a byte[] array with a null byte? Tom McGlynn Java 4 04-18-2008 11:49 PM
Converting a Primative byte array to a Byte array object Kirby Java 3 10-08-2004 03:01 AM
Appending byte[] to another byte[] array Bharat Bhushan Java 15 08-05-2003 07:52 PM



Advertisments