WTF? Printing unicode strings

 
 
Ron Garret
      05-18-2006
>>> u'\xbd'
u'\xbd'
>>> print _

Traceback (most recent call last):
File "<stdin>", line 1, in ?
UnicodeEncodeError: 'ascii' codec can't encode character u'\xbd' in
position 0: ordinal not in range(128)
>>>

 
John Salerno
      05-18-2006
Ron Garret wrote:
> >>> u'\xbd'
> u'\xbd'
> >>> print _
> Traceback (most recent call last):
> File "<stdin>", line 1, in ?
> UnicodeEncodeError: 'ascii' codec can't encode character u'\xbd' in
> position 0: ordinal not in range(128)


Not sure if this really helps you, but:

>>> u'\xbd'

u'\xbd'
>>> print _


>>>

 
Fredrik Lundh
      05-18-2006
Ron Garret wrote:

> >>> u'\xbd'
> u'\xbd'
> >>> print _
> Traceback (most recent call last):
> File "<stdin>", line 1, in ?
> UnicodeEncodeError: 'ascii' codec can't encode character u'\xbd' in
> position 0: ordinal not in range(128)


so stdout on your machine is ascii, and you don't understand why you
cannot print a non-ascii unicode character to it? wtf?

</F>

 
Ron Garret
      05-18-2006
Fredrik Lundh wrote:

> Ron Garret wrote:
>
> > >>> u'\xbd'
> > u'\xbd'
> > >>> print _
> > Traceback (most recent call last):
> > File "<stdin>", line 1, in ?
> > UnicodeEncodeError: 'ascii' codec can't encode character u'\xbd' in
> > position 0: ordinal not in range(128)
>
> so stdout on your machine is ascii, and you don't understand why you
> cannot print a non-ascii unicode character to it? wtf?
>
> </F>


I forgot to mention:

>>> sys.getdefaultencoding()

'utf-8'
>>> print u'\xbd'

Traceback (most recent call last):
File "<stdin>", line 1, in ?
UnicodeEncodeError: 'ascii' codec can't encode character u'\xbd' in
position 0: ordinal not in range(128)
>>>

 
Robert Kern
      05-18-2006
Ron Garret wrote:

> I forgot to mention:
>
> >>> sys.getdefaultencoding()
> 'utf-8'


A) You shouldn't be able to do that.
B) Don't do that.
C) It's not relevant to the encoding of stdout which determines how unicode
strings get converted to bytes when printing them:

>>> import sys
>>> sys.stdout.encoding

'UTF-8'
>>> sys.getdefaultencoding()

'ascii'
>>> print u'\xbd'

½

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco
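
The distinction drawn in point C can be checked directly. The following is a minimal sketch, assuming a Python version that accepts the u'' prefix (2.x or 3.3+). The helper name encode_for_stream is made up for illustration and is not part of any library. It reads the two settings discussed above and shows what bytes the same character becomes under different target encodings:

```python
import sys

# The stream's encoding (derived from the terminal/locale) is what matters
# when printing; sys.getdefaultencoding() is a separate interpreter setting.
stdout_encoding = getattr(sys.stdout, "encoding", None)
default_encoding = sys.getdefaultencoding()


def encode_for_stream(text, encoding, errors="replace"):
    """Hypothetical helper: encode text for a stream, degrading characters
    the encoding cannot represent to '?' instead of raising."""
    return text.encode(encoding or "ascii", errors)


half = u"\xbd"  # VULGAR FRACTION ONE HALF

utf8_bytes = encode_for_stream(half, "utf-8")      # two bytes in UTF-8
latin1_bytes = encode_for_stream(half, "latin-1")  # one byte in Latin-1
ascii_bytes = encode_for_stream(half, "ascii")     # not representable
```

With errors="replace" the ASCII case yields a question mark rather than the UnicodeEncodeError seen earlier in the thread; whether that degradation is acceptable depends on the program.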

 
Ron Garret
      05-18-2006
Robert Kern wrote:

> Ron Garret wrote:
>
> > I forgot to mention:
> >
> > >>> sys.getdefaultencoding()
> > 'utf-8'
>
> A) You shouldn't be able to do that.


What can I say? I can.

> B) Don't do that.


OK. What should I do instead?

> C) It's not relevant to the encoding of stdout which determines how unicode
> strings get converted to bytes when printing them:
>
> >>> import sys
> >>> sys.stdout.encoding
> 'UTF-8'
> >>> sys.getdefaultencoding()
> 'ascii'
> >>> print u'\xbd'
> ½


OK, so how am I supposed to change the encoding of sys.stdout? It comes
up as US-ASCII on my system. Simply setting it doesn't work:

>>> import sys
>>> sys.stdout.encoding='utf-8'

Traceback (most recent call last):
File "<stdin>", line 1, in ?
TypeError: readonly attribute
>>>


rg
 
Robert Kern
      05-18-2006
Ron Garret wrote:
> Robert Kern wrote:
>
>>Ron Garret wrote:
>>
>>>I forgot to mention:
>>>
>>> >>> sys.getdefaultencoding()
>>>'utf-8'
>>
>>A) You shouldn't be able to do that.

>
> What can I say? I can.


See B).

>>B) Don't do that.

>
> OK. What should I do instead?


See below.

>>C) It's not relevant to the encoding of stdout which determines how unicode
>>strings get converted to bytes when printing them:
>>
>> >>> import sys
>> >>> sys.stdout.encoding
>>'UTF-8'
>> >>> sys.getdefaultencoding()
>>'ascii'
>> >>> print u'\xbd'
>>½

>
> OK, so how am I supposed to change the encoding of sys.stdout? It comes
> up as US-ASCII on my system. Simply setting it doesn't work:


You will have to use a terminal that accepts UTF-8.
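
When the terminal does accept UTF-8 but Python still reports an ASCII stdout (for example because the locale on the remote machine is unset), a workaround often used in Python 2 is to wrap the byte stream in an encoding writer from codecs.getwriter. The following is a minimal sketch of that technique, not a replacement for fixing the terminal. It uses an in-memory io.BytesIO as a stand-in for the real stream so it can run anywhere; in Python 2 the wrapped object would be sys.stdout itself, and in Python 3 it would be sys.stdout.buffer.

```python
import codecs
import io

raw = io.BytesIO()                       # stand-in for a byte-oriented stdout
writer = codecs.getwriter("utf-8")(raw)  # StreamWriter that encodes on write

writer.write(u"\xbd\n")                  # unicode in, UTF-8 bytes out
written = raw.getvalue()
```

This only changes which bytes Python emits. If the terminal is not actually interpreting its input as UTF-8, the character will still display incorrectly.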


 
Serge Orlov
      05-18-2006
Ron Garret wrote:
> Robert Kern wrote:
>
> > Ron Garret wrote:
> >
> > > I forgot to mention:
> > >
> > > >>> sys.getdefaultencoding()
> > > 'utf-8'
> >
> > A) You shouldn't be able to do that.

>
> What can I say? I can.
>
> > B) Don't do that.

>
> OK. What should I do instead?


The exact answer depends on which OS and terminal you are using and on what
your program is supposed to do: are you going to distribute the program,
or is it just for internal use?

 
Ron Garret
      05-18-2006
Serge Orlov wrote:

> Ron Garret wrote:
> > Robert Kern wrote:
> >
> > > Ron Garret wrote:
> > >
> > > > I forgot to mention:
> > > >
> > > > >>> sys.getdefaultencoding()
> > > > 'utf-8'
> > >
> > > A) You shouldn't be able to do that.

> >
> > What can I say? I can.
> >
> > > B) Don't do that.

> >
> > OK. What should I do instead?

>
> The exact answer depends on which OS and terminal you are using and on what
> your program is supposed to do: are you going to distribute the program,
> or is it just for internal use?


I'm using an OS X terminal to ssh to a Linux machine.

But what about this:

>>> f2=open('foo','w')
>>> f2.write(u'\xFF')

Traceback (most recent call last):
File "<stdin>", line 1, in ?
UnicodeEncodeError: 'ascii' codec can't encode character u'\xff' in
position 0: ordinal not in range(128)
>>>


That should have nothing to do with my terminal, right?

I just found http://www.amk.ca/python/howto/unicode, which seems to be
enlightening. The answer seems to be something like:

import codecs
f = codecs.open('foo','w','utf-8')

but that seems pretty awkward.

rg
 
Robert Kern
      05-18-2006
Ron Garret wrote:

> I'm using an OS X terminal to ssh to a Linux machine.


Click on the "Terminal" menu, then "Window Settings...". Choose "Display" from
the combobox. At the bottom you will see a combobox titled "Character Set
Encoding". Choose "Unicode (UTF-8)".

> But what about this:
>
> >>> f2=open('foo','w')
> >>> f2.write(u'\xFF')
>
> Traceback (most recent call last):
> File "<stdin>", line 1, in ?
> UnicodeEncodeError: 'ascii' codec can't encode character u'\xff' in
> position 0: ordinal not in range(128)
>
> That should have nothing to do with my terminal, right?


Correct, that is a different problem. f.write() expects a string of bytes, not a
unicode string. In order to convert unicode strings to byte strings without an
explicit .encode() method call, Python uses the default encoding, which is
'ascii'. It's not easily changeable, and for a good reason: your modules won't
work on anyone else's machine if you hack that setting.
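
The explicit .encode() route can be sketched as follows. This is a minimal example under stated assumptions: it uses the with statement (not available in the Python of this thread without newer versions) and a temporary directory in place of the file 'foo' from the question. The first part reproduces the failing ASCII conversion; the second chooses UTF-8 explicitly and writes the resulting bytes.

```python
import os
import tempfile

text = u"\xff"

# The implicit conversion amounts to encoding with 'ascii', which fails
# for any character outside the 0-127 range.
try:
    text.encode("ascii")
    implicit_failed = False
except UnicodeEncodeError:
    implicit_failed = True

# Choosing the encoding explicitly and writing the bytes works.
path = os.path.join(tempfile.mkdtemp(), "foo")
with open(path, "wb") as f:
    f.write(text.encode("utf-8"))

# Read the raw bytes back to see what ended up on disk.
with open(path, "rb") as f:
    data = f.read()
```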

> I just found http://www.amk.ca/python/howto/unicode, which seems to be
> enlightening. The answer seems to be something like:
>
> import codecs
> f = codecs.open('foo','w','utf-8')
>
> but that seems pretty awkward.


<shrug> About as clean as it gets when dealing with text encodings.
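
In later Python versions (2.6 onward, and 3.x where it is the built-in open), io.open takes the encoding as a keyword argument and does the same job as the codecs.open call quoted above. The following is a minimal sketch of that alternative, not something proposed in this thread; the temporary directory is an assumption so the example does not touch the current directory.

```python
import io
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "foo")

# Text goes in as unicode; the file object encodes it to UTF-8 on write.
with io.open(path, "w", encoding="utf-8") as f:
    f.write(u"\xff")

# Reading with the same encoding gives the original unicode string back.
with io.open(path, "r", encoding="utf-8") as f:
    round_tripped = f.read()

# On disk the character is stored as a two-byte UTF-8 sequence.
with io.open(path, "rb") as f:
    raw_bytes = f.read()
```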


 