Velocity Reviews - Computer Hardware Reviews


UnicodeDecodeError: 'ascii' codec can't decode byte

 
 
Gilles Ganault
 
      06-17-2008
Hello

It seems like I have Unicode data in a CSV file but Python is using
a different code page, so isn't happy when I'm trying to read and put
this data into an SQLite database with APSW:

========
sql = "INSERT INTO mytable (col1,col2) VALUES (?,?)"
cursor.executemany(sql, records("test.tsv"))
"""
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc9 in position
18: ordinal not in range(128)
"""
========

What should I do so Python doesn't raise this error? Should I convert
the data in the CSV file, or is there some function that I should call
before APSW's executemany()?

Thank you.
 
 
 
 
 
Peter Otten
 
      06-17-2008
Gilles Ganault wrote:

> It seems like I have Unicode data in a CSV file but Python is using
> a different code page, so isn't happy when I'm trying to read and put
> this data into an SQLite database with APSW:


My guess is that you have non-ascii characters in a bytestring.
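For illustration, the same error can be reproduced without a file or a database. This is only a minimal sketch: the sample bytes are made up, and it assumes the byte 0xc9 is a Latin-1 encoded "É" (the thread does not say which encoding the file really uses).

```python
# Hypothetical sample data: "CAFÉ" encoded as Latin-1, where 0xc9 is "É".
raw = b"CAF\xc9"

try:
    # ASCII only covers byte values 0..127, so 0xc9 cannot be decoded.
    raw.decode("ascii")
except UnicodeDecodeError as exc:
    message = str(exc)

# Decoding with the encoding the bytes were written in succeeds.
text = raw.decode("latin-1")
```

Here `message` holds the same kind of "'ascii' codec can't decode byte 0xc9" text as in the traceback, which is what an implicit conversion of a bytestring to unicode produces when no encoding is given.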

> What should I do so Python doesn't raise this error? Should I convert
> the data in the CSV file, or is there some function that I should call
> before APSW's executemany()?


You cannot have unicode data in a file, only unicode converted to
bytestrings using some encoding. Assuming that encoding is UTF-8 and that
apsw can cope with unicode, try to convert your data to unicode before
feeding it to the database api:

> sql = "INSERT INTO mytable (col1,col2) VALUES (?,?)"


rows = ([col.decode("utf-8") for col in row]
        for row in records("test.tsv"))
cursor.executemany(sql, rows)
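
A self-contained sketch of the same idea, in case it helps to try it without the original file. Everything here is an assumption for illustration: the standard library's sqlite3 module stands in for APSW, the `records()` below is a hypothetical replacement for the one in the original post (which is not shown in the thread), and the sample lines are made-up UTF-8 data.

```python
import sqlite3

def records(lines, delimiter=b"\t"):
    """Hypothetical stand-in for records(): yield each line as a list of byte strings."""
    for line in lines:
        yield line.rstrip(b"\r\n").split(delimiter)

def decoded(rows, encoding="utf-8"):
    """Decode every column of every row from bytes to unicode text."""
    for row in rows:
        yield [col.decode(encoding) for col in row]

# Made-up sample data: two tab-separated lines, encoded as UTF-8.
sample = [u"1\t\xc9cole\n".encode("utf-8"),
          u"2\tcaf\xe9\n".encode("utf-8")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (col1, col2)")

sql = "INSERT INTO mytable (col1,col2) VALUES (?,?)"
# The database API only ever sees text, never raw bytes.
conn.executemany(sql, decoded(records(sample)))

stored = [row[0] for row in
          conn.execute("SELECT col2 FROM mytable ORDER BY col1")]
conn.close()
```

If the file turns out not to be UTF-8, passing a different `encoding` to `decoded()` is the only change needed; decoding with the wrong encoding either raises UnicodeDecodeError again or silently yields the wrong characters.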

Peter
 
 
 
 
 
Gilles Ganault
 
      06-17-2008
On Tue, 17 Jun 2008 09:23:28 +0200, Peter Otten wrote:
> Assuming that encoding is UTF-8 and that apsw can cope
> with unicode, try to convert your data to unicode before
> feeding it to the database api:
>
>> sql = "INSERT INTO mytable (col1,col2) VALUES (?,?)"
>
> rows = ([col.decode("utf-8") for col in row] for row in
> records("test.tsv"))
> cursor.executemany(sql, rows)


Thanks again.
 
 
 
 