Literal Escaped Octets

 
 
Chason Hayes
      02-06-2006
I am trying to convert raw binary data to data with escaped octets in
order to store it in a bytea field on a PostgreSQL server. I could do this
easily in C/C++, but I need to do it in Python. I am not sure how to read
and evaluate the binary value of a byte in a long string when it is a
non-printable ASCII value in Python. I read about some ways to use unpack
from the struct module, but I really couldn't understand where that would
help. I looked at the MIMIEncode module, but I don't know how to convert
the object to a string. Is there a module that will convert the data? It
seems to me that this question must have been answered a million times
before, but I can't find anything.



See http://www.postgresql.org/docs/8.1/i...pe-binary.html
for a description of the problem domain.


 
Alex Martelli
      02-06-2006
Chason Hayes wrote:
...
> easily in c/c++ but I need to do it in python. I am not sure how to read
> and evaluate the binary value of a byte in a long string when it is a non
> printable ascii value in python.


If you have a bytestring (AKA plain string) s, the binary value of its
k-th byte is ord(s[k]).
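
A minimal sketch of how that could be turned into a filter. It assumes nothing beyond the standard library; the function names byte_values and escape_octets are made up for illustration. In the Python 2 of this thread, s[k] is a one-character string, so ord() is needed; in Python 3, indexing a bytes object already gives an integer. Iterating over bytearray(data) yields integers on both. The escaping shown is only one level (backslash and non-printable octets as \ooo); the exact rules, including the extra quoting needed inside an SQL string literal, are those on the PostgreSQL page linked in the first post, and this sketch does not claim to cover them.

```python
def byte_values(data):
    """Return the integer value (0-255) of every byte in a bytestring."""
    # bytearray yields ints on both Python 2 and Python 3, so no ord() call
    return list(bytearray(data))


def escape_octets(data):
    """Escape backslashes and non-printable bytes as \\ooo octal sequences.

    Illustrative only: single quotes are passed through unchanged, so the
    result is NOT safe to paste into an SQL string literal as-is.
    """
    out = []
    for value in bytearray(data):
        if value == 0x5C:               # backslash becomes a doubled backslash
            out.append("\\\\")
        elif 32 <= value <= 126:        # printable ASCII is kept as-is
            out.append(chr(value))
        else:                           # everything else as 3-digit octal
            out.append("\\%03o" % value)
    return "".join(out)
```

For example, byte_values(b"\x00AB\xff") gives [0, 65, 66, 255].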


Alex
 
Steve Holden
      02-06-2006
Chason Hayes wrote:
> I am trying to convert raw binary data to data with escaped octets in
> order to store it in a bytea field on postgresql server. I could do this
> easily in c/c++ but I need to do it in python. I am not sure how to read
> and evaluate the binary value of a byte in a long string when it is a non
> printable ascii value in python. I read some ways to use unpack from the
> struct module, but i really couldn't understand where that would help. I
> looked at the MIMIEncode module but I don't know how to convert the object
> to a string. Is there a module that will convert the data? It seems to me
> that this question must have been answered a million times before but I
> can't find anything.
>
>
>
> See http://www.postgresql.org/docs/8.1/i...pe-binary.html
> for a description of the problem domain.
>
>

The URL you reference discusses how to represent arbitrary values
in string literals. If you already have the data in a Python string, the
best advice is to use a parameterized query - that way your Python DB
API module will do the escaping for you!

regards
Steve
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC www.holdenweb.com
PyCon TX 2006 www.python.org/pycon/

 
Chason Hayes
      02-06-2006
On Mon, 06 Feb 2006 13:39:17 +0000, Steve Holden wrote:

> Chason Hayes wrote:
>> I am trying to convert raw binary data to data with escaped octets in
>> order to store it in a bytea field on postgresql server. I could do this
>> easily in c/c++ but I need to do it in python. I am not sure how to read
>> and evaluate the binary value of a byte in a long string when it is a non
>> printable ascii value in python. I read some ways to use unpack from the
>> struct module, but i really couldn't understand where that would help. I
>> looked at the MIMIEncode module but I don't know how to convert the object
>> to a string. Is there a module that will convert the data? It seems to me
>> that this question must have been answered a million times before but I
>> can't find anything.
>>
>>
>>
>> See http://www.postgresql.org/docs/8.1/i...pe-binary.html
>> for a description of the problem domain.
>>
>>

> The URL you reference is discussing how you represent arbitrary values
> in string literals. If you already have the data in a Python string the
> best advise is to use a parameterized query - that way your Python DB
> API module will do the escaping for you!
>
> regards
> Steve


Thanks for the input. I tried that with a format string and a
dictionary, but I still received a database error indicating illegal
string values. This error went away completely when I used a test file
consisting only of text, but it reproduced every time with a true binary
file. If you can let me know where I am wrong, or show me a working code
snippet with an SQL insert that contains a variable holding raw binary
data, I would greatly appreciate it.

Chason

 
Chason Hayes
      02-06-2006
On Sun, 05 Feb 2006 21:07:23 -0800, Alex Martelli wrote:

> Chason Hayes <(E-Mail Removed)> wrote:
> ...
>> easily in c/c++ but I need to do it in python. I am not sure how to read
>> and evaluate the binary value of a byte in a long string when it is a non
>> printable ascii value in python.

>
> If you have a bytestring (AKA plain string) s, the binary value of its
> k-th byte is ord(s[k]).
>
>
> Alex


Thank you very much. That is the function I was looking for to write
a filter.

Chason

 
Steve Holden
      02-07-2006
Chason Hayes wrote:
> On Mon, 06 Feb 2006 13:39:17 +0000, Steve Holden wrote:

[...]
>>
>>The URL you reference is discussing how you represent arbitrary values
>>in string literals. If you already have the data in a Python string the
>>best advise is to use a parameterized query - that way your Python DB
>>API module will do the escaping for you!
>>
>>regards
>> Steve

>
>
> Thanks for the input. I tried that with a format string and a
> dictionary, but I still received a database error indicating illegal
> string values. This error went away completely when I used a test file
> consisting only of text, but reproduced everytime with a true binary file.
> If you can let me know where I am wrong or show me a code snippet with a
> sql insert that contains a variable with raw binary data that works,
> I would greatly appreciate it.
>

I tried and my experience was exactly the same, which made me think less
of PostgreSQL.

They don't seem to implement the SQL BLOB type properly, so it looks as
though that rebarbative syntax with all the backslashes is necessary. Sorry.

regards
Steve
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC www.holdenweb.com
PyCon TX 2006 www.python.org/pycon/

 
Bengt Richter
      02-07-2006
On Mon, 06 Feb 2006 04:40:31 GMT, Chason Hayes wrote:

>I am trying to convert raw binary data to data with escaped octets in
>order to store it in a bytea field on postgresql server. I could do this
>easily in c/c++ but I need to do it in python. I am not sure how to read
>and evaluate the binary value of a byte in a long string when it is a non
>printable ascii value in python. I read some ways to use unpack from the
>struct module, but i really couldn't understand where that would help. I
>looked at the MIMIEncode module but I don't know how to convert the object
>to a string. Is there a module that will convert the data? It seems to me
>that this question must have been answered a million times before but I
>can't find anything.
>

Have you considered just encoding the data as text in hex or base64, e.g.,

>>> import binascii
>>> s = '\x00\x01\x02\x03ABCD0123'
>>> binascii.hexlify(s)
'000102034142434430313233'
>>> binascii.b2a_base64(s)
'AAECA0FCQ0QwMTIz\n'

which is also reversible later of course:

>>> h = binascii.hexlify(s)
>>> binascii.unhexlify(h)
'\x00\x01\x02\x03ABCD0123'
>>> b64 = binascii.b2a_base64(s)
>>> binascii.a2b_base64(b64)
'\x00\x01\x02\x03ABCD0123'

Regards,
Bengt Richter
 
Chason Hayes
      02-08-2006
On Tue, 07 Feb 2006 15:06:49 +0000, Bengt Richter wrote:

> On Mon, 06 Feb 2006 04:40:31 GMT, Chason Hayes <(E-Mail Removed)> wrote:
>
>>I am trying to convert raw binary data to data with escaped octets in
>>order to store it in a bytea field on postgresql server. I could do this
>>easily in c/c++ but I need to do it in python. I am not sure how to read
>>and evaluate the binary value of a byte in a long string when it is a non
>>printable ascii value in python. I read some ways to use unpack from the
>>struct module, but i really couldn't understand where that would help. I
>>looked at the MIMIEncode module but I don't know how to convert the object
>>to a string. Is there a module that will convert the data? It seems to me
>>that this question must have been answered a million times before but I
>>can't find anything.
>>

> Have you considered just encoding the data as text in hex or base64, e.g.,
>
> >>> import binascii
> >>> s = '\x00\x01\x02\x03ABCD0123'
> >>> binascii.hexlify(s)

> '000102034142434430313233'
> >>> binascii.b2a_base64(s)

> 'AAECA0FCQ0QwMTIz\n'
>
> which is also reversible later of course:
> >>> h = binascii.hexlify(s)
> >>> binascii.unhexlify(h)

> '\x00\x01\x02\x03ABCD0123'
> >>> b64 = binascii.b2a_base64(s)
> >>> binascii.a2b_base64(b64)

> '\x00\x01\x02\x03ABCD0123'
>
> Regards,
> Bengt Richter


I had just about come to that conclusion last night while I was working on
it. I was going to use

import base64
base64.encodestring(binarydata)

and

base64.decodestring(stringdata)
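
For reference, the base64 round trip being considered here can be sketched as follows. This assumes a current Python 3, where the standard-library functions are base64.b64encode and base64.b64decode; the helper names to_text and from_text are made up for illustration. Note that base64 text is about a third larger than the raw data, which may matter when choosing a column type.

```python
import base64


def to_text(binarydata):
    """Encode raw bytes as ASCII text that can be stored in a text column."""
    return base64.b64encode(binarydata).decode("ascii")


def from_text(stringdata):
    """Recover the original bytes from the base64 text."""
    return base64.b64decode(stringdata)


raw = b"\x00\x01\x02\x03ABCD0123"
text = to_text(raw)          # 12 raw bytes become 16 base64 characters
restored = from_text(text)   # identical to raw
```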

I then wasn't sure if I should still use the bytea field or just use a
text field.

Do you have a suggestion?

 
Chason Hayes
      02-08-2006
On Tue, 07 Feb 2006 01:58:00 +0000, Steve Holden wrote:

> Chason Hayes wrote:
>> On Mon, 06 Feb 2006 13:39:17 +0000, Steve Holden wrote:

> [...]
>>>
>>>The URL you reference is discussing how you represent arbitrary values
>>>in string literals. If you already have the data in a Python string the
>>>best advise is to use a parameterized query - that way your Python DB
>>>API module will do the escaping for you!
>>>
>>>regards
>>> Steve

>>
>>
>> Thanks for the input. I tried that with a format string and a
>> dictionary, but I still received a database error indicating illegal
>> string values. This error went away completely when I used a test file
>> consisting only of text, but reproduced everytime with a true binary file.
>> If you can let me know where I am wrong or show me a code snippet with a
>> sql insert that contains a variable with raw binary data that works,
>> I would greatly appreciate it.
>>

> I tried and my experience was exactly the same, which made me think less
> of PostgreSQL.
>
> They don't seem to implement the SQL BLOB type properly, so it looks as
> though that rebarbative syntax with all the backslashes is necessary. Sorry.
>
> regards
> Steve


With regard to escaping data parameters, I have found that I have to
explicitly add quotes to my strings for them to be understood by
PostgreSQL. For example:

import base64

ifs = open("binarydatafile", "rb")  # "rb": read the file as binary
binarydata = ifs.read()
ifs.close()
stringdata = base64.encodestring(binarydata)

# does not work: the value lands in the SQL text without quotes
cursor.execute("insert into binarytable values(%s)" % stringdata)

# need to do this first
newstringdata = "'" + stringdata + "'"
cursor.execute("insert into binarytable values(%s)" % newstringdata)

Then the insert statement works.
Is this expected behavior? Is there a better way of doing this?

thanks for any insight
Chason


 
Steve Holden
      02-08-2006
Chason Hayes wrote:
> On Tue, 07 Feb 2006 01:58:00 +0000, Steve Holden wrote:
>
>
>>Chason Hayes wrote:
>>
>>>On Mon, 06 Feb 2006 13:39:17 +0000, Steve Holden wrote:

>>
>>[...]
>>
>>>>The URL you reference is discussing how you represent arbitrary values
>>>>in string literals. If you already have the data in a Python string the
>>>>best advise is to use a parameterized query - that way your Python DB
>>>>API module will do the escaping for you!
>>>>
>>>>regards
>>>> Steve
>>>
>>>
>>>Thanks for the input. I tried that with a format string and a
>>>dictionary, but I still received a database error indicating illegal
>>>string values. This error went away completely when I used a test file
>>>consisting only of text, but reproduced everytime with a true binary file.
>>>If you can let me know where I am wrong or show me a code snippet with a
>>>sql insert that contains a variable with raw binary data that works,
>>>I would greatly appreciate it.
>>>

>>
>>I tried and my experience was exactly the same, which made me think less
>>of PostgreSQL.
>>
>>They don't seem to implement the SQL BLOB type properly, so it looks as
>>though that rebarbative syntax with all the backslashes is necessary. Sorry.
>>
>>regards
>> Steve

>
>
> with regards to escaping data parameters I have found that I have to
> specifically add quotes to my strings for them to be understood by
> pstgresql. For example
>
> ifs=open("binarydatafile","r")
> binarydata=ifs.read()
> stringdata=base64.encodestring(binarydata)
>
> #does not work
> cursor.execute("insert into binarytable values(%s)" % stringdata)
>
> #need to do this first
> newstringdata = "'" + stringdata + "'"
>
> then the select statment works.
> Is this expected behavior? Is there a better way of doing this?
>
> thanks for any insight


Yes, parameterize your queries. I assume you are using psycopg or
something similar to create the database connection (i.e. something
that expects the "%s" parameter style - there are other options, but we
needn't discuss them here).

The magic incantation you seek is:

cursor.execute("insert into binarytable values(%s)", (stringdata, ))

Note that here there are TWO arguments to the .execute() method. The
first is a parameterized SQL statement, and the second is a tuple of
data items, one for each parameter mark in the SQL.

Using this technique all necessary quoting (and even data conversion
with a good database module) is performed inside the database driver,
meaning (among other things) that your program is no longer vulnerable
to the dreaded SQL injection errors.
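
A runnable sketch of the two-argument form, with one substitution stated plainly: it uses the standard-library sqlite3 module and an in-memory database as a stand-in, because a PostgreSQL server is not assumed to be available. sqlite3 uses "?" as its parameter mark where a "%s"-style driver would use %s. The table layout and sample bytes are made up for illustration. The value is wrapped in the driver's Binary constructor, which the Python DB API asks every module to provide for binary columns; whether a particular PostgreSQL driver handles bytea this way is an assumption to check against its own documentation.

```python
import sqlite3

# Stand-in database: sqlite3 instead of PostgreSQL (assumption for this sketch)
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute("create table binarytable (data blob)")

# Sample bytes including a NUL, a single quote and a backslash
binarydata = b"\x00\x01'\\ABCD\xff"

# TWO arguments: the SQL with a parameter mark, and a tuple of values.
# The driver does the quoting; nothing is pasted into the SQL text.
cursor.execute(
    "insert into binarytable values (?)",
    (sqlite3.Binary(binarydata),),
)
conn.commit()

# Read the value back to confirm it survived unchanged
cursor.execute("select data from binarytable")
stored = bytes(cursor.fetchone()[0])
conn.close()
```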

This is the technique I was hoping would work with the bytea datatype,
but alas it doesn't. ISTM that PostgreSQL needs a bit of work there,
even though it is otherwise a very polished product.

regards
Steve
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC www.holdenweb.com
PyCon TX 2006 www.python.org/pycon/

 