Re: problem with unicode

 
 
John Machin
04-25-2008
On Apr 25, 9:15 pm, "(E-Mail Removed)"
<(E-Mail Removed)> wrote:
> Hi everybody,
>
> I'm using the win32 console and have the following short program
> excerpt
>
> # media is a binary string (mysql escaped zipped file)
>
> >> print media
>
> xワユロ[ヨ ...
> (works)
>
> >> print unicode(media)
>
> UnicodeDecodeError: 'ascii' codec can't decode byte 0x9c in position
> 1: ordinal not in range(128)
> (ok i guess print assumes you want to print to ascii)


Guessing is no substitute for reading the manual.

print has nothing to do with your problem; the problem is
unicode(media) -- as you specified no encoding, it uses the default
encoding, which is ascii [unless you have been mucking about, which is
not recommended]. As the 2nd byte is 0x9c, which lies outside ascii's
range of 0x00-0x7f, ascii is going nowhere.
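
To illustrate the point, here is a minimal sketch. The bytes in media are a
made-up stand-in for the poster's data (only the leading 'x' and 0x9c come
from the thread), and try_decode is a hypothetical helper. The code is
written to run on Python 3, where data.decode('ascii') plays the role that
unicode(data) with the default ascii encoding plays in Python 2.

```python
# Made-up stand-in for the poster's binary data; only the first two
# bytes ('x' and 0x9c) are taken from the thread.
media = b"x\x9c\xcbH"


def try_decode(data, encoding):
    """Return the decoded text, or a short failure message."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        return "failed: %s" % exc


# ascii only covers 0x00-0x7f, so the byte 0x9c at position 1 is rejected.
print(try_decode(media, "ascii"))

# latin-1 maps every byte value to a character, so this decode cannot fail.
print(repr(try_decode(media, "latin-1")))
```

latin-1 appears here only because it never raises; that does not make the
result meaningful text, since the data is binary to begin with.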


>
> >> print unicode(media).encode('utf-8')
>
> UnicodeDecodeError: 'ascii' codec can't decode byte 0x9c in position
> 1: ordinal not in range(128)
> (why does this not work?)


Already unicode(media) "doesn't work", so naturally(?)
unicode(media).whatever() won't be better -- whatever won't be called.
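
A small sketch of why the second step never runs, again assuming Python 3
syntax and made-up bytes for media. encode_utf8 and calls are hypothetical
names used only to record whether the second step was reached.

```python
media = b"x\x9c\xcbH"  # made-up stand-in for the poster's data

calls = []  # records every time the second step actually runs


def encode_utf8(text):
    calls.append(text)
    return text.encode("utf-8")


error = None
try:
    # The argument is evaluated first; the decode raises, so
    # encode_utf8 is never entered.
    encode_utf8(media.decode("ascii"))
except UnicodeDecodeError as exc:
    error = exc

print(len(calls))  # prints 0
```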

>
> # mapString is a unicode string (i think at least)
> >> print "'" + mapString + "'"
>
> ' yu_200703_hello\ 831 v1234.9874 '
>
> >> mystr = "%s %s" % (mapString, media)
>
> UnicodeDecodeError: 'ascii' codec can't decode byte 0x9c in position
> 1: ordinal not in range(128)
>
> >> mystr = "%s %s" % (mapString.encode('utf-8'), media.encode('utf-8'))
>
> UnicodeDecodeError: 'ascii' codec can't decode byte 0x9c in position
> 1: ordinal not in range(128)


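This is merely repeating the original problem. In the first case,
mixing a unicode object with a str object makes Python decode the str
with the default ascii codec. In the second, media.encode('utf-8') on a
str object first has to decode it to unicode, again with ascii.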
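
One possible way around it, sketched under the assumption that a byte
string is what is wanted in the end (the post does not say what mystr is
used for): encode only the text and leave the binary data untouched. The
bytes in media are made up, map_string is a hypothetical name for the
poster's mapString, and the code is written to run on Python 3.

```python
# Text, copied from the value shown in the post.
map_string = u" yu_200703_hello\\ 831 v1234.9874 "

# Made-up stand-in for the poster's binary data.
media = b"x\x9c\xcbH"

# Encode only the text; the binary data is already bytes, so it is
# concatenated as-is and no implicit ascii decode is triggered.
combined = map_string.encode("utf-8") + b" " + media

print(repr(combined))
```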

>
> I don't know what to do. I just want to concatenate two strings where
> apparently one is a binary string, the other one is a unicode string
> and I always seem to get this error.
>
> Any help is appreciated


We need a clue or two; do this and let us know what it says:

print type(media), repr(media)
print type(mapString), repr(mapString)
import sys; print sys.stdout.encoding

Also you say that "print media" works. Do you mean that it produces
some meaningful text that you understand? What I see on the screen in
Google Groups is the following 6 characters:
LATIN SMALL LETTER X
KATAKANA LETTER WA
KATAKANA LETTER YU
KATAKANA LETTER RO
LEFT SQUARE BRACKET
KATAKANA LETTER YO
Is that what you see?
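
For anyone wanting to check their own screen the same way, a minimal
sketch using the standard unicodedata module. The code points in shown are
an assumption reconstructed from the six names listed above, not taken
from the poster's actual data.

```python
import unicodedata

# Code points assumed from the six character names listed above.
shown = u"x\u30ef\u30e6\u30ed[\u30e8"

names = [unicodedata.name(ch) for ch in shown]
for name in names:
    print(name)
```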

What is it that you call "win32 console"?
 