Extract an image from a RTF file

 
 
Bryan.Fodness@gmail.com
 
      02-14-2009
I have a large number of RTF files in which the only content is an
image. I would like to extract the images and save them as PNG files.
Eventually, I would also like to grab some text that is on each image.
I think PIL has something for this.

Does anyone have any suggestions on how to start?
 
 
 
 
 
Terry Reedy
 
      02-14-2009
(E-Mail Removed) wrote:
> I have a large number of RTF files in which the only content is an
> image. I would like to extract the images and save them as PNG files.
> Eventually, I would also like to grab some text that is on each image.
> I think PIL has something for this.
>
> Does anyone have any suggestions on how to start?


The Wikipedia article on Rich Text Format has several links, which lead to
http://pyrtf.sourceforge.net/
http://code.google.com/p/pyrtf-ng/
The former says it does RTF generation, including images.
The latter says it does RTF generation and parsing, but only claims to be a
rewrite of the former.

 
 
 
 
 
Curt Hash
 
      02-14-2009
On Sat, Feb 14, 2009 at 11:01 AM, Terry Reedy <(E-Mail Removed)> wrote:
>
> (E-Mail Removed) wrote:
>>
>> I have a large number of RTF files in which the only content is an
>> image. I would like to extract the images and save them as PNG files.
>> Eventually, I would also like to grab some text that is on each image.
>> I think PIL has something for this.
>>
>> Does anyone have any suggestions on how to start?
>
> The Wikipedia article on Rich Text Format has several links, which lead to
> http://pyrtf.sourceforge.net/
> http://code.google.com/p/pyrtf-ng/
> The former says it does RTF generation, including images.
> The latter says it does RTF generation and parsing, but only claims to be a rewrite of the former.


I've written an RTF parser in Python before, but for the purpose of
filtering and discarding content rather than extracting it.

Take a look at the specification here:
http://www.microsoft.com/downloads/d...displaylang=en

You will find that images are specified by one or more RTF control
words followed by a long string of hex data. For this special purpose,
you will not need to write a parser for the entire specification. Just
search the file for the correct sequence of control words, extract the
hex data that follows, and save it to a file.
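
As a rough sketch of that approach (not a full parser), something like the following might work. It assumes the pictures live in \pict groups and are hex-encoded, which is the common case in the spec; binary \bin data and escaped braces are not handled. The function names and the extension table are my own, and the set of format control words your files use may differ.

```python
import binascii
import re

# Matches the \pict control word itself, not a longer word that starts with it.
PICT_START = re.compile(r"\\pict(?![a-zA-Z])")
# A control word: backslash, letters, optional numeric parameter, optional space delimiter.
CONTROL_WORD = re.compile(r"\\([a-zA-Z]+)(-?\d+)? ?")
# Picture-format control words mapped to a file extension (assumed subset of the spec).
BLIP_EXTENSIONS = {"pngblip": "png", "jpegblip": "jpg", "emfblip": "emf", "wmetafile": "wmf"}


def pict_groups(rtf_text):
    """Yield the top-level text of each pict group, skipping nested groups."""
    match = PICT_START.search(rtf_text)
    while match:
        depth, pos, parts = 0, match.start(), []
        while pos < len(rtf_text):
            ch = rtf_text[pos]
            if ch == "{":
                depth += 1
            elif ch == "}":
                if depth == 0:
                    break  # closing brace of the pict group itself
                depth -= 1
            elif depth == 0:
                parts.append(ch)
            pos += 1
        yield "".join(parts)
        match = PICT_START.search(rtf_text, pos)


def extract_images(rtf_text):
    """Return a list of (extension, image_bytes) for each hex-encoded picture."""
    images = []
    for body in pict_groups(rtf_text):
        words = [m.group(1) for m in CONTROL_WORD.finditer(body)]
        ext = next((BLIP_EXTENSIONS[w] for w in words if w in BLIP_EXTENSIONS), "bin")
        # Drop the control words, then the whitespace, leaving only the hex digits.
        hex_data = re.sub(r"\s+", "", CONTROL_WORD.sub("", body))
        if re.fullmatch(r"(?:[0-9a-fA-F]{2})+", hex_data):
            images.append((ext, binascii.unhexlify(hex_data)))
    return images


def save_images(rtf_path, out_prefix):
    """Write every picture found in rtf_path to out_prefix_N.ext; return the paths."""
    with open(rtf_path, encoding="latin-1") as fh:
        images = extract_images(fh.read())
    paths = []
    for index, (ext, data) in enumerate(images):
        path = f"{out_prefix}_{index}.{ext}"
        with open(path, "wb") as out:
            out.write(data)
        paths.append(path)
    return paths
```

If the embedded data is already PNG, the saved file is the finished result. For other formats you would still need an imaging library to convert the saved file to PNG.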

It helps if you open the RTF document in a text editor and locate the
specific control group that contains the image, as the format and
order of the control words vary depending on the application that
created it. If all of your documents were created with the same
application, it will be much easier.
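
If there are too many files to eyeball in an editor, a small script can list which control words follow each \pict, so you can check whether all the documents use the same layout. This is only a sketch: describe_pict_groups and the 200-character window are arbitrary choices of mine, not anything from the spec.

```python
import re

# Matches the \pict control word itself, not a longer word that starts with it.
PICT_START = re.compile(r"\\pict(?![a-zA-Z])")
# A control word with its optional numeric parameter.
CONTROL_WORD = re.compile(r"\\([a-zA-Z]+)(-?\d+)?")


def describe_pict_groups(rtf_text, context=200):
    """For each \\pict found, list the control words in the next `context` characters."""
    reports = []
    for match in PICT_START.finditer(rtf_text):
        window = rtf_text[match.start():match.start() + context]
        reports.append([word + number for word, number in CONTROL_WORD.findall(window)])
    return reports
```

Running this over a handful of files and comparing the lists should show quickly whether one search pattern will cover them all.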
 