Parsing email attachments: get_payload() produces unsaveable data

 
 
dpapathanasiou
      10-04-2009
I'm using Python to access an email account via POP and, for each
incoming message, save any attachments.

This is the function which scans the message for attachments:

def save_attachments(local_folder, msg_text):
    """Scan the email message text and save the attachments (if any)
    in the local_folder"""
    if msg_text:
        for part in email.message_from_string(msg_text).walk():
            if part.is_multipart() or part.get_content_maintype() == 'text':
                continue
            filename = part.get_filename(None)
            if filename:
                filedata = part.get_payload(decode=True)
                if filedata:
                    write_file(local_folder, filename, filedata)

All the way up to write_file(), it's working correctly.

The filename variable matches the name of the attached file, and the
filedata variable contains binary data corresponding to the file's
contents.

When I try to write the filedata to a file system folder, though, I
get an AttributeError in the stack trace.

Here is my write_file() function:

def write_file(folder, filename, f, chunk_size=4096):
    """Write the file data f to the folder and filename combination"""
    result = False
    if confirm_folder(folder):
        try:
            file_obj = open(os.path.join(folder, file_base_name(filename)), 'wb', chunk_size)
            for file_chunk in read_buffer(f, chunk_size):
                file_obj.write(file_chunk)
            file_obj.close()
            result = True
        except (IOError):
            print "file_utils.write_file: could not write '%s' to '%s'" % (file_base_name(filename), folder)
    return result

I also tried applying this regex:

filedata = re.sub(r'\r(?!=\n)', '\r\n', filedata)  # Bare \r becomes \r\n

after reading this post
(http://stackoverflow.com/questions/787739/python-email-getpayload-decode-fails-when-hitting-equal-sign),
but it hasn't resolved the problem.

Is there any way of correcting the output of get_payload() so I can
save it to a file?
 
Albert Hopkins
      10-04-2009
On Sun, 2009-10-04 at 07:27 -0700, dpapathanasiou wrote:
> When I try to write the filedata to a file system folder, though, I
> get an AttributeError in the stack trace.


And where might we be able to see that stack trace?

-a

 
dpapathanasiou
      10-04-2009

> And where might we be able to see that stack trace?


This is it:

Exception: ('AttributeError', '<no args>', [
' File "/opt/server/smtp/smtps.py", line 213, in handle\n email_replier.post_reply(recipient_mbox, \'\'.join(data))\n',
' File "/opt/server/smtp/email_replier.py", line 108, in post_reply\n save_attachments(result[2], msg_text)\n',
' File "/opt/server/smtp/email_replier.py", line 79, in save_attachments\n data_manager.upload_file(item_id, filename, filedata)\n',
' File "../db/data_manager.py", line 697, in upload_file\n if docs_db.save_file(item_id, file_name, file_data):\n',
' File "../db/docs_db.py", line 102, in save_file\n result = file_utils.write_file(saved_file_path, saved_file_name + saved_file_ext, file_data)\n'])

If you're wondering, I'm using this to capture the exception:

def formatExceptionInfo(maxTBlevel=5):
    """For displaying exception information"""
    cla, exc, trbk = sys.exc_info()
    excName = cla.__name__
    try:
        excArgs = exc.__dict__["args"]
    except KeyError:
        excArgs = "<no args>"
    excTb = traceback.format_tb(trbk, maxTBlevel)
    return (excName, excArgs, excTb)
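
A side note on the '<no args>' value: on current Python versions the exception's arguments are exposed as the exc.args attribute rather than as an entry in exc.__dict__, so the lookup above would likely always fall into the KeyError branch. A minimal sketch, assuming Python 3; format_exception_info is a hypothetical variant written for illustration, not the code from this post:

```python
import sys
import traceback


def format_exception_info(max_tb_level=5):
    """Return (name, args, formatted traceback) for the exception being handled."""
    cla, exc, trbk = sys.exc_info()
    # Read args from the attribute; it is not kept in the instance __dict__.
    return (cla.__name__, exc.args, traceback.format_tb(trbk, max_tb_level))


try:
    b"payload".read()  # bytes objects have no read() method
except AttributeError:
    name, args, tb_lines = format_exception_info()

print(name, args)
```

With the args preserved, the message text that normally follows "AttributeError:" stays available to whatever logs the tuple.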

 
Albert Hopkins
      10-04-2009
On Sun, 2009-10-04 at 08:16 -0700, dpapathanasiou wrote:
> > And where might we be able to see that stack trace?

>
> This is it:
>
> [...]


Which is *really* difficult (for me) to read. Any chance of providing a
"normal" traceback?


 
dpapathanasiou
      10-04-2009

> Which is *really* difficult (for me) to read. Any chance of providing a
> "normal" traceback?


  File "/opt/server/smtp/smtps.py", line 213, in handle
    email_replier.post_reply(recipient_mbox, ''.join(data))
  File "/opt/server/smtp/email_replier.py", line 108, in post_reply
    save_attachments(result[2], msg_text)
  File "/opt/server/smtp/email_replier.py", line 79, in save_attachments
    data_manager.upload_file(item_id, filename, filedata)
  File "../db/data_manager.py", line 697, in upload_file
    if docs_db.save_file(item_id, file_name, file_data):
  File "../db/docs_db.py", line 102, in save_file
    result = file_utils.write_file(saved_file_path, saved_file_name + saved_file_ext, file_data)

AttributeError
 
Albert Hopkins
      10-04-2009
On Sun, 2009-10-04 at 09:17 -0700, dpapathanasiou wrote:
> > Which is *really* difficult (for me) to read. Any chance of providing a
> > "normal" traceback?

>
>   File "/opt/server/smtp/smtps.py", line 213, in handle
>     email_replier.post_reply(recipient_mbox, ''.join(data))
>   File "/opt/server/smtp/email_replier.py", line 108, in post_reply
>     save_attachments(result[2], msg_text)
>   File "/opt/server/smtp/email_replier.py", line 79, in save_attachments
>     data_manager.upload_file(item_id, filename, filedata)
>   File "../db/data_manager.py", line 697, in upload_file
>     if docs_db.save_file(item_id, file_name, file_data):
>   File "../db/docs_db.py", line 102, in save_file
>     result = file_utils.write_file(saved_file_path, saved_file_name + saved_file_ext, file_data)
>
> AttributeError


Are you sure this is the complete traceback? Usually an AttributeError
returns a text message such as:

AttributeError: foo has no such attribute bar

Also, the traceback says the exception happened in "save_file", but the
code you posted was a function called "save_attachments" and the
function call is different.

Would be nice if we could get the full traceback with the exact matching
code. Otherwise we have to make guesses. But I've given up. Perhaps
someone else is better off helping you.

-a
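
One guess that fits a traceback ending inside write_file(): if read_buffer() (never shown in this thread) expects a file-like object and calls f.read(), then handing it the byte string returned by get_payload(decode=True) would raise an AttributeError. The sketch below, assuming Python 3, uses a hypothetical stand-in for read_buffer() and traceback.format_exc() to keep the final message line that the custom formatter dropped:

```python
import traceback


def read_buffer(f, chunk_size=4096):
    """Hypothetical stand-in for the unseen read_buffer(): assumes f is file-like."""
    while True:
        chunk = f.read(chunk_size)  # fails if f is a plain bytes object
        if not chunk:
            break
        yield chunk


def describe_failure(payload):
    """Return the full formatted traceback if chunked reading fails, else None."""
    try:
        for _ in read_buffer(payload):
            pass
    except AttributeError:
        return traceback.format_exc()
    return None


report = describe_failure(b"raw attachment bytes")
print(report)
```

The last line of the report names both the object type and the missing attribute, which is the detail being asked for above.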


 
dpapathanasiou
      10-14-2009

An update for the record (and in case anyone else also has this
problem):

The regex suggested in the StackOverflow post, i.e.

filedata = re.sub(r'\r(?!=\n)', '\r\n', filedata)  # Bare \r becomes \r\n

is necessary but not sufficient.
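
A note on what that pattern actually matches: (?!=\n) is a negative lookahead for the two characters "=" and "\n", so as written it also rewrites a \r that is already followed by \n. If the intent is "bare \r only", (?!\n) may be what was meant. A small sketch, assuming Python 3 and a str sample (a bytes payload would need bytes patterns):

```python
import re

sample = "a\rb\r\nc"  # one bare \r, one existing \r\n pair

# Pattern as posted: the lookahead rejects only "=\n", so every \r matches.
as_posted = re.sub(r'\r(?!=\n)', '\r\n', sample)

# Variant with a plain negative lookahead for \n: only the bare \r matches.
corrected = re.sub(r'\r(?!\n)', '\r\n', sample)

print(repr(as_posted))
print(repr(corrected))
```

The posted pattern turns the existing \r\n into \r\n\n, while the variant leaves it alone.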

It turns out that because get_payload() returns a binary stream, the
right way to save those bytes to a file is to use a function like
this:

def write_binary_file(folder, filename, filedata):
    """Write the binary file data to the folder and filename combination"""
    result = False
    if confirm_folder(folder):
        try:
            file_obj = open(os.path.join(folder, file_base_name(filename)), 'wb')
            file_obj.write(filedata)
            file_obj.close()
            result = True
        except (IOError):
            print "file_utils.write_file: could not write '%s' to '%s'" % (file_base_name(filename), folder)
    return result

I.e., filedata, the output of get_payload(), can be written all at
once, without reading and writing in 4k chunks.
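
For anyone landing here on a newer interpreter, the same idea can be sketched in Python 3, where get_payload(decode=True) returns a bytes object. This is a sketch under assumptions, not the code from this thread: the message arrives as bytes, os.path.basename stands in for the unseen file_base_name(), and the confirm_folder() check is left out.

```python
import email
import os
import tempfile
from email.message import EmailMessage


def save_attachments(local_folder, raw_message):
    """Write every named, non-text part of raw_message (bytes) into local_folder."""
    saved_paths = []
    for part in email.message_from_bytes(raw_message).walk():
        if part.is_multipart() or part.get_content_maintype() == 'text':
            continue
        filename = part.get_filename()
        if not filename:
            continue
        filedata = part.get_payload(decode=True)  # bytes, transfer-decoded
        if not filedata:
            continue
        # os.path.basename is only a stand-in for the thread's file_base_name().
        path = os.path.join(local_folder, os.path.basename(filename))
        with open(path, 'wb') as file_obj:
            file_obj.write(filedata)  # a single write, no chunked reads
        saved_paths.append(path)
    return saved_paths


# Build a throwaway message with one binary attachment and round-trip it.
payload = b'\x00\x01\r\x02\r\n\xff'
demo = EmailMessage()
demo['Subject'] = 'demo'
demo.set_content('see attachment')
demo.add_attachment(payload, maintype='application',
                    subtype='octet-stream', filename='blob.bin')

with tempfile.TemporaryDirectory() as folder:
    paths = save_attachments(folder, demo.as_bytes())
    with open(paths[0], 'rb') as saved:
        round_trip = saved.read()

print(os.path.basename(paths[0]), round_trip == payload)
```

If an existing helper insists on chunked reads, wrapping the payload as io.BytesIO(filedata) gives it a file-like read() interface without changing the bytes.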
 