Python encoding question
Hi,
I'm taking my first steps with Python and I'm having trouble understanding an encoding problem. My script:

    import os
    os.environ["NLS_LANG"] = "German_Germany.UTF8"
    import cx_Oracle
    connection = cx_Oracle.Connection("username/password@SID")
    cursor = connection.cursor()
    cursor.execute("SELECT NAME1 FROM COR WHERE CORNB='ABCDEF'")
    TEST = cursor.fetchone()
    print TEST[0]
    print TEST

When I run this script, it prints:

    München
    ('M\xc3\xbcnchen',)

Why is the umlaut printed for TEST[0] but not for TEST?

And why do both prints show the wrong encoding when I switch "fetchone()" to "fetchall()"?

    ('M\xc3\xbcnchen',)
    [('M\xc3\xbcnchen',)]

I'm running Python 2.4.3 on CentOS 5.

Regards,
Marc
Re: Python encoding question
Marc Muehlfeld wrote:
> Hi,
>
> I'm doing my first steps with python and I have a problem with
> understanding an encoding problem I have. My script:
>
> import os
> os.environ["NLS_LANG"] = "German_Germany.UTF8"
> import cx_Oracle
> connection = cx_Oracle.Connection("username/password@SID")
> cursor = connection.cursor()
> cursor.execute("SELECT NAME1 FROM COR WHERE CORNB='ABCDEF'")
> TEST = cursor.fetchone()
> print TEST[0]
> print TEST
>
> When I run this script It prints me:
> München
> ('M\xc3\xbcnchen',)
>
> Why is the Umlaut of TEST[0] printed and not from TEST?
>
> And why are both prints show the wrong encoding, when I switch
> "fetchone()" to "fetchall()":
> ('M\xc3\xbcnchen',)
> [('M\xc3\xbcnchen',)]
>
> I'm running Python 2.4.3 on CentOS 5.
>
> Regards,
> Marc

This has nothing to do with encoding. TEST[0] is a string, TEST is a tuple.

    >>> s1 = 'aline \n anotherline'
    >>> print str(s1)
    aline
     anotherline
    >>> print repr(s1)
    'aline \n anotherline'
    >>> atuple = (s1,)
    >>> print str(atuple)
    ('aline \n anotherline',)
    >>> print repr(atuple)
    ('aline \n anotherline',)

Read http://docs.python.org/reference/datamodel.html regarding __repr__ and __str__.

Basically, __str__ and __repr__ give the same result for tuples, while they differ for strings. If you want a nice representation of the tuple's elements, you have to build it yourself:

    print ', '.join([str(elem) for elem in atuple])

More generally, only strings print nicely with line breaks and UTF-8 characters. Containers such as tuples and lists display their elements using __repr__, which shows the formal representation, and so do objects that do not define __str__.

JM

PS:

    >>> class Foo:
    ...     def __str__(self):
    ...         return 'I am a nice representation of a Foo instance'
    ...
    >>> print Foo()
    I am a nice representation of a Foo instance
    >>> print str(Foo())
    I am a nice representation of a Foo instance
    >>> print repr(Foo())
    <__main__.Foo instance at 0xb73a07ac>
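The str/repr difference described in this reply can be checked with a short script. This is only a sketch: it uses a two-element tuple (the second element, 'second', is made up for illustration) and single-argument print() calls so that it should run on Python 3 as well as on the Python 2 used in the thread.

```python
s1 = 'aline \n anotherline'
atuple = (s1, 'second')  # 'second' is an illustrative extra element

# str() of a string is the string itself, so the line break is real.
print(str(s1))

# repr() of a string is a quoted literal with the line break escaped.
print(repr(s1))

# A tuple formats every element with repr(), even when str() is called on it.
tuple_as_str = str(atuple)
print(tuple_as_str)

# Formatting the elements yourself gives the readable form.
joined = ', '.join([str(elem) for elem in atuple])
print(joined)
```

The join produces one plain string, so print shows its content directly instead of the escaped element literals.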
Re: Python encoding question
Marc Muehlfeld wrote:
> Hi,
>
> <snip>
> TEST = cursor.fetchone()
> print TEST[0]
> print TEST
>
> When I run this script It prints me:
> München
> ('M\xc3\xbcnchen',)
>
> Why is the Umlaut of TEST[0] printed and not from TEST?
>

When you print a string, it simply prints it, control characters, international characters, and all. When you print a more complex object, it's up to that object to decide how to print. In the case of the tuple above, the tuple logic displays the parentheses and the comma, but calls the repr() of any objects it contains. Tuple doesn't make a special case for strings, or for numbers, it just always calls repr() (actually it's __repr__(), I think). A list does the same thing, though it'll use square brackets at the ends.

So the question boils down to what repr() does. It attempts to create a representation that could be used to create the specific object. So if there's a newline, it uses \n. And if there are non-ASCII codes, it uses hex escape sequences. And of course it adds the quote marks.

DaveA
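The point that repr() produces text that could recreate the object can be sketched as a round trip. The assumptions here are mine, not the thread's: the sketch targets Python 3, where the UTF-8 bytes need a b'' literal (on the poster's Python 2.4.3 a plain 'M\xc3\xbcnchen' string holds the same bytes), and it uses ast.literal_eval, which does not exist in 2.4.

```python
import ast

# UTF-8 encoded bytes for "München", as returned by the database driver.
raw = b'M\xc3\xbcnchen'

# repr() escapes the non-ASCII bytes as \x.. sequences and adds quote marks.
literal = repr(raw)
print(literal)

# The escaped text is a valid literal: evaluating it gives an equal object.
round_trip = ast.literal_eval(literal)

# Decoding turns the two bytes \xc3\xbc into the single character ü.
text = raw.decode('utf-8')
print(len(raw), len(text))
```

So the escaped form in the tuple output is the same data as the printed umlaut, just shown as a literal instead of being written to the terminal.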