Re: Perl script to identify corrupt mbox messages?

Discussion in 'Firefox' started by Mumia W., Jun 16, 2007.

  1. Mumia W.

    Mumia W. Guest

    On 06/16/2007 02:24 AM, Tuxedo wrote:
    > [...]
    > formail -ds <my_crappy_mbox >>reinvigorated_mbox
    >
    > .... this certainly made some changes, in fact, 10 or so additional messages
    > appear in the Mozilla index which did not show up earlier, including a
    > couple without a valid sender which are now listed by Mozilla as from
    > foo@bar, but which appear to be file fragments, i.e. not real mail.
    >
    > Most of the 3000+ messages, however, still do not show up in Mozilla.
    >
    > So I tried: ...
    > formail -zds <my_crappy_mbox >>reinvigorated_mbox
    > ... but this made the file no more readable in Mozilla than the previous try.
    >
    > and ...
    > formail -rds <my_crappy_mbox >>reinvigorated_mbox
    > ... but with the same result as the former try.
    >
    > Naturally I removed the generated (.msf) index files as well as terminated
    > the Mozilla application between the tries, in case something would get
    > cached otherwise.
    >
    > The Mozilla application simply appears to be choking on the mbox while
    > building the index. The progress bar is helplessly trying to move forward,
    > but then falls back, then forward a bit, and then back again, until it
    > finally gives up. In other words, the graphical indicator at the bottom
    > right of the application, which is meant to indicate the progress of
    > building the index, never reaches its maximum.
    >
    > Perhaps the mbox contains some very odd characters, maybe part of some
    > attachment, which causes Mozilla but not other mail clients to choke.
    > Perhaps it is the result of some malformed mail circulating via zombie
    > machines, Outlook and whatever, that affects Mozilla on multiple platforms.
    >
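
The odd-characters guess quoted above can at least be checked mechanically. The following is a minimal sketch, not something posted in this thread. It assumes that a line starting with "From " after a blank line begins a new message, and it treats NUL bytes and other control characters as suspicious. The sub name scan_mbox is made up for illustration, and whether such bytes are really what makes Mozilla choke is unverified.

```perl
use strict;
use warnings;

# Report the first suspicious line of each message: a line containing a
# NUL byte or another control character (below 0x20, except tab, LF, CR).
# The file is read line by line, so memory use stays small.
sub scan_mbox {
    my ($path) = @_;
    open my $in, '<', $path or die "Cannot read $path: $!";
    binmode $in;

    my @suspects;
    my ($msg_no, $prev_blank, $flagged) = (0, 1, 0);
    while (my $line = <$in>) {
        # Assumption: "From " after a blank line (or at the start of the
        # file) is a message separator.
        if ($prev_blank && $line =~ /^From /) {
            $msg_no++;
            $flagged = 0;
        }
        if (!$flagged && $line =~ /[\x00-\x08\x0B\x0C\x0E-\x1F]/) {
            push @suspects, { message => $msg_no, line => $. };
            $flagged = 1;    # report each message only once
        }
        $prev_blank = ($line =~ /^\r?\n\z/) ? 1 : 0;
    }
    close $in;
    return @suspects;
}

# Usage: perl scanbox.pl my_crappy_mbox
if (@ARGV == 1) {
    printf "message %d, line %d\n", $_->{message}, $_->{line}
        for scan_mbox($ARGV[0]);
}
```

A message number of 0 means the suspicious line came before the first separator, which would itself suggest a damaged file.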


    Research the problem with the help of this website:
    http://kb.mozillazine.org/

    In particular, this article may (or may not) be of help:
    http://kb.mozillazine.org/Inbox_stays_blank

    Here is a script that might improve things a little bit:

    use strict;
    use warnings;
    use FileHandle;
    use Email::Folder;
    use Date::Parse qw(str2time);   # module name is case-sensitive
    use POSIX qw(ctime);

    # Input mailbox (only read) and the rewritten copy.
    my ($file) = glob('~/tmp/mozmail/OldTests');
    my $outfile = 'output.mbox';

    my $fh = FileHandle->new($outfile, '>') or die("Stop: $!");
    my $folder = Email::Folder->new($file);

    my $count = 0;
    while (my $msg = $folder->next_message) {
        # Build a "From - <date>" separator line from the Date header;
        # fall back to the current time if the header is missing or unparsable.
        my $date = $msg->header('Date');
        my $time = defined $date ? str2time($date) : undef;
        $date = ctime(defined $time ? $time : time); chomp $date;
        $fh->print("From - $date\n");
        $fh->print($msg->as_string() . "\n");
        $count++;
    }
    print "There are $count messages in the folder.\n";

    $fh->close;

    Email::Folder and Date::Parse are modules you can download from CPAN.
    The other modules are standard parts of Perl. You should change $file
    and $outfile as appropriate. You shouldn't modify the original mailbox file.

    You probably won't need the script. Things should improve after you've
    deleted the .msf (index) file and closed and reopened Mozilla.
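
The two pieces of advice above (leave the original mailbox alone, delete the index) can be combined in a few lines. This is only a sketch under assumptions: the sub name reset_index is hypothetical, the index is assumed to sit next to the mailbox as "<mailbox>.msf", and Mozilla should be closed before running it.

```perl
use strict;
use warnings;
use File::Copy qw(copy);

# Keep a backup copy of the mailbox, then delete its .msf index so that
# Mozilla rebuilds the index the next time the folder is opened.
# Returns 1 if an index file was removed, 0 if there was none.
sub reset_index {
    my ($mbox) = @_;
    copy($mbox, "$mbox.bak") or die "Backup of $mbox failed: $!";

    my $msf = "$mbox.msf";    # assumed location of the index file
    return 0 unless -e $msf;
    unlink $msf or die "Cannot delete $msf: $!";
    return 1;
}

# Usage: perl resetindex.pl /path/to/mailbox
if (@ARGV == 1) {
    my $removed = reset_index($ARGV[0]);
    print $removed ? "Index removed.\n" : "No index file found.\n";
}
```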

    (Followups set to alt.fan.mozilla)
     
    Mumia W., Jun 16, 2007
    #1

  2. Tuxedo

    Tuxedo Guest

    Mumia W. wrote:

    > [...]
    > Here is a script that might improve things a little bit:
    > [...]


    Excellent! However, the problem does not want to be so easily solved. It
    was no problem getting the above script running with the two up-to-date,
    non-standard modules, and after saving the script as fixbox.pl, I
    successfully tested it on a small mbox file containing only 3 messages.

    However, with the real file, on a 2GHz notebook with 512MB of memory and
    Perl 5.8.7, munching through the approximately 150MB mbox, the above
    script (or the shell) returned: "Out of Memory!". The resulting
    'output.mbox' file remained empty.
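
One way around the memory limit is to avoid building message objects and copy the file one line at a time. The sketch below is an assumption-laden alternative, not the script posted above: it uses no CPAN modules, the sub name rewrite_mbox is hypothetical, it assumes a "From " line after a blank line is a separator, and it keeps whatever date text the separator already carries instead of parsing the Date header. Whether Mozilla then indexes the result is untested.

```perl
use strict;
use warnings;

# Copy an mbox line by line, rewriting each separator line
# "From <sender> <date>" as "From - <date>". Only the current line is
# held in memory, so the size of the mailbox should not matter.
sub rewrite_mbox {
    my ($in_path, $out_path) = @_;
    open my $in,  '<', $in_path  or die "Cannot read $in_path: $!";
    open my $out, '>', $out_path or die "Cannot write $out_path: $!";
    binmode $in;
    binmode $out;

    my $count      = 0;
    my $prev_blank = 1;    # treat the start of the file like a blank line
    while (my $line = <$in>) {
        if ($prev_blank && $line =~ /^From /) {
            # Replace the sender field with "-", keep the rest of the line.
            $line =~ s/^From \S+\s+/From - /;
            $count++;
        }
        print {$out} $line;
        $prev_blank = ($line =~ /^\r?\n\z/) ? 1 : 0;
    }
    close $in;
    close $out or die "Cannot close $out_path: $!";
    return $count;
}

# Usage: perl fixbox_stream.pl my_crappy_mbox output.mbox
if (@ARGV == 2) {
    my $n = rewrite_mbox(@ARGV);
    print "There are $n messages in the folder.\n";
}
```

Body lines that begin with "From " but do not follow a blank line are copied unchanged, so they are not mistaken for new messages.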

    Personally, I'm not a fan of Mozilla mail, and looking a bit closer, I
    could not find a solution to this particular issue on kb.mozillazine.org
    either. The problematic mailbox is someone else's. I'm seriously
    contemplating telling them to abandon 1) Windows and 2) Mozilla mail.
     
    Tuxedo, Jun 16, 2007
    #2
