UnicodeDecodeError: 'utf-8' codec can't decode byte 0x89 in position

 
 
iMath
 
      12-06-2012
The following code is originally from http://zetcode.com/databases/mysqlpythontutorial/
in the "Writing images" part.


import MySQLdb as mdb
import sys

try:
    fin = open("Chrome_Logo.svg.png",'rb')
    img = fin.read()
    fin.close()

except IOError as e:

    print ("Error %d: %s" % (e.args[0],e.args[1]))
    sys.exit(1)


try:
    conn = mdb.connect(host='localhost',user='testuser',
                       passwd='test623', db='testdb')
    cursor = conn.cursor()
    cursor.execute("INSERT INTO Images SET Data='%s'" % \
                   mdb.escape_string(img))

    conn.commit()

    cursor.close()
    conn.close()

except mdb.Error as e:

    print ("Error %d: %s" % (e.args[0],e.args[1]))
    sys.exit(1)


I ported it to Python 3, and also changed
fin = open("chrome.png")
to
fin = open("Chrome_Logo.png",'rb')
but when I run it, it gives the following error:

Traceback (most recent call last):
  File "E:\Python\py32\itest4.py", line 20, in <module>
    mdb.escape_string(img))
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x89 in position 0: invalid start byte

So how do I fix it?
 
 
 
 
 
Terry Reedy
 
      12-06-2012
On 12/6/2012 5:07 AM, iMath wrote:
> the following code originally from http://zetcode.com/databases/mysqlpythontutorial/
> within the "Writing images" part .
>
>
> import MySQLdb as mdb


MySQLdb is not part of the stdlib. 'MySQLdb' should be in the subject line
to get the attention of someone who is familiar with it. I am not.

> import sys
>
> try:
>     fin = open("Chrome_Logo.svg.png",'rb')
>     img = fin.read()
>     fin.close()
>
> except IOError as e:
>
>     print ("Error %d: %s" % (e.args[0],e.args[1]))
>     sys.exit(1)
>
>
> try:
>     conn = mdb.connect(host='localhost',user='testuser',
>                        passwd='test623', db='testdb')
>     cursor = conn.cursor()
>     cursor.execute("INSERT INTO Images SET Data='%s'" % \
>                    mdb.escape_string(img))


From the name, I would expect that escape_string expects text. From the
error, it seems to specifically expect utf-8 encoded bytes. After
decoding, I expect that it does some sort of 'escaping'. An image does
not qualify as that sort of input. If escape_string takes an encoding
arg, latin1 *might* work.
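
A small sketch of why latin1 is the usual candidate for this kind of workaround: it maps every byte value 0-255 to exactly one code point, so decoding arbitrary binary data never fails and encoding it back is lossless. Whether escape_string actually accepts an encoding argument is not confirmed anywhere in this thread; the snippet only illustrates the codec behaviour, and the `data` value is made up for the demonstration.

```python
# Every possible byte value, standing in for arbitrary image data (assumption).
data = bytes(range(256))

# UTF-8 rejects many byte sequences, for example a lone 0x89 byte.
try:
    data[0x89:0x8A].decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# latin-1 decodes any byte sequence and round-trips it unchanged.
as_text = data.decode("latin-1")
round_trip = as_text.encode("latin-1")

print(utf8_ok, round_trip == data)
```

Even if this gets past the decode step, the result is text that merely carries the bytes, so it has to be encoded with the same codec again before it is treated as image data.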

>     conn.commit()
>
>     cursor.close()
>     conn.close()
>
> except mdb.Error as e:
>
>     print ("Error %d: %s" % (e.args[0],e.args[1]))
>     sys.exit(1)
>
>
> I port it to python 3 ,and also change
> fin = open("chrome.png")
> to
> fin = open("Chrome_Logo.png",'rb')
> but when I run it ,it gives the following error :
>
> Traceback (most recent call last):
> File "E:\Python\py32\itest4.py", line 20, in <module>
> mdb.escape_string(img))
> UnicodeDecodeError: 'utf-8' codec can't decode byte 0x89 in position 0: invalid start byte
>
> so how to fix it ?
>



--
Terry Jan Reedy

 
 
 
 
 
Hans Mulder
 
      12-06-2012
On 6/12/12 11:07:51, iMath wrote:
> the following code originally from http://zetcode.com/databases/mysqlpythontutorial/
> within the "Writing images" part .
>
>
> import MySQLdb as mdb
> import sys
>
> try:
>     fin = open("Chrome_Logo.svg.png",'rb')
>     img = fin.read()
>     fin.close()
>
> except IOError as e:
>
>     print ("Error %d: %s" % (e.args[0],e.args[1]))
>     sys.exit(1)
>
>
> try:
>     conn = mdb.connect(host='localhost',user='testuser',
>                        passwd='test623', db='testdb')
>     cursor = conn.cursor()
>     cursor.execute("INSERT INTO Images SET Data='%s'" % \
>                    mdb.escape_string(img))


You shouldn't call mdb.escape_string directly. Instead, you
should put placeholders in your SQL statement and let MySQLdb
figure out how to properly escape whatever needs escaping.

Somewhat confusingly, placeholders are written as %s in MySQLdb.
They differ from strings in not being enclosed in quotes.
The other difference is that you'd provide two arguments to
cursor.execute; the second of these is a tuple; in this case
a tuple with only one element:

cursor.execute("INSERT INTO Images SET Data=%s", (img,))
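
To illustrate the placeholder style without needing a MySQL server, here is a sketch that uses the standard-library sqlite3 module as a stand-in. That substitution is an assumption made only so the example runs anywhere; it is not what the original code uses. sqlite3 writes its placeholder as ? where MySQLdb writes %s, but the idea is the same: the SQL text and the values travel separately, and the driver deals with the binary data. The table layout and the fake image bytes are made up.

```python
import sqlite3

# Fake PNG-like bytes (assumption); the leading 0x89 is the byte that
# broke the UTF-8 decode in the traceback.
img = b"\x89PNG\r\n\x1a\n" + bytes(range(256))

# In-memory database as a stand-in for the MySQL server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Images (Id INTEGER PRIMARY KEY, Data BLOB)")

# The placeholder is not quoted, and the values go in a separate tuple.
# sqlite3 spells the placeholder "?"; MySQLdb spells it "%s".
conn.execute("INSERT INTO Images (Data) VALUES (?)", (img,))
conn.commit()

# Read the row back to confirm the bytes survived unchanged.
stored = conn.execute("SELECT Data FROM Images").fetchone()[0]
conn.close()

print(stored == img)
```

No string formatting with the % operator is involved, so the bytes are never forced through a text codec by the calling code.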

>     conn.commit()
>
>     cursor.close()
>     conn.close()
>
> except mdb.Error as e:
>
>     print ("Error %d: %s" % (e.args[0],e.args[1]))
>     sys.exit(1)
>
>
> I port it to python 3 ,and also change
> fin = open("chrome.png")
> to
> fin = open("Chrome_Logo.png",'rb')
> but when I run it ,it gives the following error :
>
> Traceback (most recent call last):
> File "E:\Python\py32\itest4.py", line 20, in <module>
> mdb.escape_string(img))
> UnicodeDecodeError: 'utf-8' codec can't decode byte 0x89 in position 0: invalid start byte
>
> so how to fix it ?


Python 3 distinguishes between binary data and Unicode text.
Trying to apply string functions to images or other binary
data won't work.
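
A minimal sketch of that distinction, using the 8-byte PNG signature as sample data (the variable names are only for illustration): a file opened with 'rb' yields bytes, and Python 3 refuses to mix bytes with str instead of converting silently.

```python
data = b"\x89PNG\r\n\x1a\n"   # bytes: what open(..., 'rb').read() returns
text = "PNG"                  # str: Unicode text

print(type(data).__name__, type(text).__name__)

# Mixing the two types is an error in Python 3, not an implicit conversion.
try:
    data + text
    mix_error = None
except TypeError as exc:
    mix_error = exc

# Indexing bytes gives integers, indexing str gives one-character strings.
first_byte = data[0]
first_char = text[0]

print(mix_error is not None, first_byte, first_char)
```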

Maybe correcting this bytes/strings confusion and porting
to Python 3 in one go is too large a transformation. In
that case, your best bet would be to go back to Python 2
and fix all the bytes/string confusion there. When you've
got it working again, you may be ready to port to Python 3.


Hope this helps,

-- HansM

 
 
Steven D'Aprano
 
      12-06-2012
On Thu, 06 Dec 2012 02:07:51 -0800, iMath wrote:

> the following code originally from
> http://zetcode.com/databases/mysqlpythontutorial/ within the "Writing
> images" part .
>
>
> import MySQLdb as mdb
> import sys
>
> try:
>     fin = open("Chrome_Logo.svg.png",'rb')
>     img = fin.read()
>     fin.close()
> except IOError as e:
>     print ("Error %d: %s" % (e.args[0],e.args[1]))
>     sys.exit(1)


Every time a programmer catches an exception, only to merely print a
vague error message and then exit, God kills a kitten. Please don't do
that.

If all you are going to do is print an error message and then exit,
please don't bother. All you do is make debugging harder. When Python
detects an error, by default it prints a full traceback, which gives you
lots of information to track down the error. By catching that exception
as you do, you lose that information and make it harder to debug.
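
One way to follow that advice, sketched with a hypothetical helper named read_image: drop the try/except entirely and let a with block take care of closing the file. If the file is missing, the exception propagates and Python prints the full traceback. The temporary file exists only so the sketch runs anywhere.

```python
import os
import tempfile


def read_image(path):
    """Return the raw bytes of the file at `path` (hypothetical helper).

    There is deliberately no try/except here: a missing or unreadable
    file raises OSError, and the caller sees the full traceback.
    """
    with open(path, "rb") as fin:   # the file is closed even on error
        return fin.read()


# Write a tiny stand-in file, read it back, then clean up.
with tempfile.NamedTemporaryFile(suffix=".png", delete=False) as tmp:
    tmp.write(b"\x89PNG\r\n\x1a\n")

img = read_image(tmp.name)
os.remove(tmp.name)

print(len(img))
```

If an exception really has to be caught, for example to add context, re-raising it with a bare raise keeps the traceback intact.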

Moving on to the next thing:


[snip code]
> I port it to python 3 ,and also change fin = open("chrome.png")
> to
> fin = open("Chrome_Logo.png",'rb')
> but when I run it ,it gives the following error :
>
> Traceback (most recent call last):
> File "E:\Python\py32\itest4.py", line 20, in <module>
> mdb.escape_string(img))
> UnicodeDecodeError: 'utf-8' codec can't decode byte 0x89 in position 0:
> invalid start byte
>
> so how to fix it ?


I suggest you start by reading the documentation for
MySQLdb.escape_string. What does it do? What does it expect? A byte
string or a unicode text string?

It seems very strange to me that you are reading a binary file, then
passing it to something which appears to be expecting a string. It looks
like what happens is that the PNG image starts with a 0x89 byte, and the
escape_string function tries to decode those bytes into Unicode text:

py> img = b"\x89\x00\x23\xf2"  # fake PNG binary data
py> img.decode('utf-8')  # I'm expecting text
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x89 in position 0: invalid start byte

Without knowing more about escape_string, I can only make a wild guess.
Try this:

import base64
img = fin.read()  # read the binary data of the PNG file
data = base64.encodebytes(img)  # encode the image as ASCII-only base64 bytes
cursor.execute("INSERT INTO Images SET Data='%s'" % \
               mdb.escape_string(data))


and see what that does.
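
For reference, a sketch of what the base64 step produces, independent of MySQLdb and with made-up image bytes: base64.encodebytes returns a bytes object containing only ASCII characters, so it decodes cleanly as UTF-8, and base64.decodebytes recovers the original data. Whether escape_string then accepts the result is still a guess.

```python
import base64

# Fake PNG-like bytes (assumption), including values that are invalid UTF-8.
img = b"\x89PNG\r\n\x1a\n" + bytes(range(256))

encoded = base64.encodebytes(img)    # bytes, ASCII-only, with line breaks
as_text = encoded.decode("utf-8")    # safe: every byte is below 128

# Anything read back from the database must be decoded again.
restored = base64.decodebytes(as_text.encode("ascii"))

print(type(encoded).__name__, restored == img)
```

The cost of this approach is that the column stores base64 text rather than the image itself, which is roughly a third larger and has to be decoded on every read.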


--
Steven
 
 
iMath
 
      12-07-2012
On Thursday, December 6, 2012 at 7:07:35 PM UTC+8, Hans Mulder wrote:
> On 6/12/12 11:07:51, iMath wrote:
> > [snip code]
> > cursor.execute("INSERT INTO Images SET Data='%s'" % \
> >                mdb.escape_string(img))
>
> You shouldn't call mdb.escape_string directly. Instead, you
> should put placeholders in your SQL statement and let MySQLdb
> figure out how to properly escape whatever needs escaping.
>
> Somewhat confusingly, placeholders are written as %s in MySQLdb.
> They differ from strings in not being enclosed in quotes.
> The other difference is that you'd provide two arguments to
> cursor.execute; the second of these is a tuple; in this case
> a tuple with only one element:
>
> cursor.execute("INSERT INTO Images SET Data=%s", (img,))

Thanks, but it still doesn't work.
 