Unicode question

 
 
Ben Edwards (lists)
07-28-2006
I am using Python 2.4 on Ubuntu Dapper and am working through Dive into
Python.

There are a couple of inconsistencies.

Firstly, sys.setdefaultencoding('iso-8859-1') does not work; I have to do
sys.setdefaultencoding = 'iso-8859-1'

Secondly, the following does not give a 'UnicodeError: ASCII encoding
error:', and I would expect it to. In fact it prints out the n with ~
above it fine:

sys.setdefaultencoding = 'ascii'
s = u'La Pe\xf1a'
print s

Any insight?
Ben


 
Steve M
07-28-2006
Ben Edwards (lists) wrote:
> I am using python 2.4 on Ubuntu dapper, I am working through Dive into
> Python.
>
> There are a couple of inconsistencies.
>
> Firstly sys.setdefaultencoding('iso-8859-1') does not work, I have to do
> sys.setdefaultencoding = 'iso-8859-1'


When you run a Python script, the interpreter does some setup of its own
before executing your script: it automatically imports the site module,
and one of the things site does is delete the name
sys.setdefaultencoding. This means that by the time even your first line
of code runs, that name no longer exists, so you cannot call the
function as in your first attempt.

The second attempt, sys.setdefaultencoding = 'iso-8859-1', creates a new
name in the sys namespace and assigns it a string. This will not have
the desired effect, or probably any effect at all.

I have found that in order to change the default encoding with that
function, you can put the command in a file called sitecustomize.py
which, when placed in the appropriate location (which is
platform-dependent), will be called in time to have the desired effect.

So the order of events is something like:
1. Invoke Python on myscript.py
2. Python does its startup work and then executes sitecustomize.py, if
one exists.
3. Python deletes the name sys.setdefaultencoding, thereby making the
function that was so named inaccessible.
4. Python then begins executing myscript.py.
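
As a rough sketch of what such a sitecustomize.py might contain, assuming the order of events above: the helper name set_default_encoding and the variable changed are illustrative, not part of Python. The check for the name is there so the file does nothing, rather than failing, when it is run at a point where sys.setdefaultencoding has already been removed or never existed.

```python
# sitecustomize.py (sketch; the helper name is made up for illustration)
import sys

def set_default_encoding(name):
    """Call sys.setdefaultencoding if it is still present.

    Returns True if the function was called, False if the name is gone.
    """
    setter = getattr(sys, 'setdefaultencoding', None)
    if not callable(setter):
        # Name already deleted (or absent on this interpreter): do nothing.
        return False
    setter(name)
    return True

changed = set_default_encoding('iso-8859-1')
```

Run as an ordinary script, changed should come out False, which is the situation described in step 4: the name is no longer available by then.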


Regarding the location of sitecustomize.py, on Windows it is
C:\Python24\Lib\sitecustomize.py.

My guess is that you should put it in the same directory as the bulk of
the Python standard library files. (Also in that directory is a
subdirectory called site-packages, where you can put custom modules
that will be available for import from any of your scripts.)

 
Martin v. Löwis
07-28-2006
Ben Edwards (lists) wrote:
> Firstly sys.setdefaultencoding('iso-8859-1') does not work, I have to do
> sys.setdefaultencoding = 'iso-8859-1'


That "works", but has no effect. You bind the variable
sys.setdefaultencoding to some value, but that value is never used for
anything (do sys.getdefaultencoding() to see what I mean). You could
just as well write

sys.standardkodierung = 'iso-8859-1'
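
A small check along those lines, as a sketch: it compares sys.getdefaultencoding() before and after the assignment. The variable names before, bound and after are only for illustration.

```python
import sys

before = sys.getdefaultencoding()

# This only binds a new attribute on the sys module to a string;
# no function is called, so the default encoding is not touched.
sys.setdefaultencoding = 'iso-8859-1'
bound = sys.setdefaultencoding

after = sys.getdefaultencoding()

# Remove the attribute again so the sys module is left as it was found.
del sys.setdefaultencoding
```

Here before and after should be equal, and bound is just the string that was assigned.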

> secondly the following does not give a 'UnicodeError: ASCII encoding
> error:', and I would expect it to. In fact it prints out the n with ~
> above it fine:
>
> sys.setdefaultencoding = 'ascii'
> s = u'La Pe\xf1a'
> print s
>
> Any insight?


The print statement uses sys.stdout.encoding, not the default encoding.
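
To see where the expected error would come from, here is a sketch that encodes the same string explicitly instead of relying on print. The variable names ascii_ok, latin1_bytes and stdout_encoding are illustrative only.

```python
import sys

s = u'La Pe\xf1a'

# Encoding explicitly to ASCII fails: U+00F1 has no ASCII representation.
try:
    s.encode('ascii')
    ascii_ok = True
except UnicodeError:
    ascii_ok = False

# A Latin-1 target can represent the character as a single byte.
latin1_bytes = s.encode('iso-8859-1')

# The encoding associated with standard output; this can be None
# when output is not going to a terminal.
stdout_encoding = getattr(sys.stdout, 'encoding', None)
```

If stdout_encoding names something that can represent the character, such as UTF-8 or ISO-8859-1, the print succeeds regardless of what was assigned to sys.setdefaultencoding.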

Regards,
Martin
 