Velocity Reviews

John Reese 01-21-2005 12:19 AM

Is there a library to parse Mozilla "mork" documents?
 
Mozilla, Firefox, Thunderbird, and so forth use this awful format
called MORK to store all kinds of things: which messages you've read
in a newsgroup, headers and indexes into the mbox file of messages in
a mail folder, and address books. It's documented to some extent
here:
http://www.mozilla.org/mailnews/arch/mork/primer.txt

Does anyone know of a Python library for parsing these files? A
single file basically just stores the equivalent of a nested
dictionary with text that can be declared separately and interpolated.
jwz has an over-specific Perl version at
http://www.jwz.org/hacks/marginal.html, which I might have to try to
translate if there's nothing already available in Python.
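
To make the "nested dictionary with text declared separately and interpolated" idea concrete, here is a minimal sketch. It assumes a simplified toy notation that only loosely resembles Mork: `<...>` sections declare `(id=text)` pairs, and `[row ...]` sections hold cells that are either `(^col=literal)` or `(^col^valueref)`. The names `TOY` and `parse_toy` are made up for illustration, and this is not a parser for real Mozilla files, which also involve escapes, scopes, tables, and transaction groups.

```python
import re

# Toy input (an assumption, not real Mork): strings are declared once in <...>
# and rows refer back to them by id with a leading "^".
TOY = ("<(80=name)(81=email)(82=shared@example.com)> "
       "[1(^80=John)(^81^82)] [2(^80=Tim)(^81^82)]")

DICT_RE = re.compile(r"<(.*?)>", re.S)                    # a <...> declaration section
ROW_RE = re.compile(r"\[(\w+)((?:\([^)]*\))*)\]")         # [rowid(cell)(cell)...]
CELL_RE = re.compile(r"\(([^)]*)\)")                      # one (...) cell


def parse_toy(text):
    """Return {row_id: {column_name: value}} with all references resolved."""
    # Pass 1: collect every (id=text) declaration.
    atoms = {}
    for body in DICT_RE.findall(text):
        for cell in CELL_RE.findall(body):
            key, _, value = cell.partition("=")
            atoms[key] = value

    def resolve(token):
        # "^80" means "look up id 80"; anything else is literal text.
        return atoms[token[1:]] if token.startswith("^") else token

    # Pass 2: build the nested dictionary, interpolating declared text.
    rows = {}
    for row_id, cells in ROW_RE.findall(text):
        row = {}
        for cell in CELL_RE.findall(cells):
            if "=" in cell:
                col, _, val = cell.partition("=")         # (^col=literal)
            else:
                head, _, tail = cell[1:].partition("^")   # (^col^valueref)
                col, val = "^" + head, "^" + tail
            row[resolve(col)] = resolve(val)
        rows[row_id] = row
    return rows
```

Calling `parse_toy(TOY)` gives one inner dictionary per row, keyed by the declared column names rather than by their ids. The two-pass layout is only a convenience for the sketch; it would not cope with declarations that appear after the rows that use them being redefined later in the file.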

Peter Rowell 01-21-2005 02:30 AM

Re: Is there a library to parse Mozilla "mork" documents?
 
John Reese wrote:
> Mozilla, Firefox, Thunderbird, and so forth use this awful format
> called MORK to store all kinds of things: which messages you've read

[ snip ]

I was searching on a similar question (about accessing the history)
when I came across a nifty little bookmarklet. It dumps FF history in
RDF format to the file of your choice. This temporarily solved
my problem, although in the long run I want to have direct read
access to the info.

Perhaps you can get a few ideas and go from there. The bookmarklet
was attached to Bugzilla item 241438.
https://bugzilla.mozilla.org/show_bug.cgi?id=241438

HTH,
Peter

Tim Roberts 01-21-2005 07:48 AM

Re: Is there a library to parse Mozilla "mork" documents?
 
John Reese <jtr@ofb.net> wrote:
>
>Mozilla, Firefox, Thunderbird, and so forth use this awful format
>called MORK to store all kinds of things: which messages you've read
>in a newsgroup, headers and indexes into the mbox file of messages in
>a mail folder, and address books.


Yes. What a crock that is. The MORK format is a great way to compress
tabular information, IF the information consists of the same pieces of data
over and over. E-mail boxes do not fit into that class, so I have no doubt
that the typical Thunderbird MORK file is significantly LARGER than the
same file would be in, say, INI format.
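
The size argument above can be illustrated with a rough character count. This is only a sketch under stated assumptions: `plain_size` counts an INI-style `key=value` line per field, and `interned_size` counts a made-up encoding that declares each distinct string once as `(id=text)` and then stores every field as a pair of id references. Both function names and the encoding are hypothetical, and nothing here measures a real Thunderbird file.

```python
def plain_size(rows):
    """Characters needed to write every field as an INI-style 'key=value' line."""
    return sum(len(f"{k}={v}\n") for row in rows for k, v in row.items())


def interned_size(rows):
    """Characters needed if each distinct string is declared once and then referenced.

    Hypothetical encoding: a declaration is '(id=text)' and a field is '(^keyid^valueid)'.
    """
    ids = {}
    for row in rows:
        for text in (*row.keys(), *row.values()):
            # Hand out hex ids starting at 80, one per distinct string.
            ids.setdefault(text, format(0x80 + len(ids), "X"))
    declarations = sum(len(f"({i}={t})") for t, i in ids.items())
    cells = sum(len(f"(^{ids[k]}^{ids[v]})") for row in rows for k, v in row.items())
    return declarations + cells


# Same value over and over: the indirection pays off.
repeated = [{"status": "read"} for _ in range(100)]

# A different value in every row: each string is declared AND referenced.
unique = [{"subject": f"message {n}"} for n in range(100)]
```

Comparing `interned_size(repeated)` with `plain_size(repeated)`, and then doing the same for `unique`, shows the trade-off: when values rarely repeat, every string pays for both its declaration and its reference, so the indirect form comes out longer than the flat one.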

I wrote a Python script to parse it, but it isn't terribly robust. I was
able to produce a dictionary, but I didn't do anything with the results.
You're welcome to take a look:
http://www.probo.com/timr/parsemsf.py
--
- Tim Roberts, timr@probo.com
Providenza & Boekelheide, Inc.

John Reese 01-21-2005 09:52 PM

Re: Is there a library to parse Mozilla "mork" documents?
 
On Thu, 20 Jan 2005 23:48:34 -0800, Tim Roberts <timr@probo.com> wrote:
> John Reese <jtr@ofb.net> wrote:
>>
>>Mozilla, Firefox, Thunderbird, and so forth use this awful format
>>called MORK to store all kinds of things: which messages you've read
>>in a newsgroup, headers and indexes into the mbox file of messages in
>>a mail folder, and address books.

>
> Yes. What a crock that is. The MORK format is a great way to compress
> tabular information, IF the information consists of the same pieces of data
> over and over. E-mail boxes do not fit into that class, so I have no doubt
> that the typical Thunderbird MORK file is significantly LARGER than the
> same file would be in, say, INI format.
>
> I wrote a Python script to parse it, but it isn't terribly robust. I was
> able to produce a dictionary, but I didn't do anything with the results.
> You're welcome to take a look:
> http://www.probo.com/timr/parsemsf.py


Thanks, I'll work with this. I have to say that this has all been
worth it just to read about Jamie Zawinski railing against this file
format. I think your comment at the top sums it up well:

# Why am I doing this?



All times are GMT.