http://qs321.pair.com?node_id=727061

blazar has asked for the wisdom of the Perl Monks concerning the following question:

I personally believe that once again I have a problem for which (more than) a very simple module (in terms of UI) must be provided, except that I can't find a suitable one, most probably due to a misunderstanding on my part.

Indeed a Mailbox search gives me quite a lot of hits, but to help you to help me I must be more precise about what I want to do. Specifically, I have a series of archived (well, "compressed," more technically) standard *NIX mailboxes and I would like to write a filter that would behave fundamentally like this:

bzip2 -qcd saved-messages*.bz2 | ./filter.pl > goodones

Where goodones would still be a standard *NIX mailbox, and filter.pl would select on some headers and the actual message body. I only need a pointer to the module to use for this. Any ideas? The modules I found with the previous CPAN search give me the impression of doing "kinda too much" and I'm slightly lost.

--
If you can't understand the incipit, then please check the IPB Campaign.

Re: How to filter a *NIX Mailbox
by almut (Canon) on Dec 01, 2008 at 12:12 UTC
Re: How to filter a *NIX Mailbox
by McD (Chaplain) on Dec 01, 2008 at 13:52 UTC
    The modules I found with the previous CPAN search give me the impression of doing "kinda too much" and I'm slightly lost.

    I think you might be on to something there. The standard unix mailbox file format is pretty simple, so you might not want to load up a heavy mail processing module at all.

    A single mail message consists of headers, a blank line, and the body. Mailbox files separate these with lines that begin "From " - case sensitive. So you can write up code that looks for:

    /^From / and ...   # New message, next line begins the headers
    /^$/ and ...       # Blank line, next line begins the body

    Interesting bit of trivia: most mail transfer agents will prefix any line that begins with "From " in a mail message they process with a ">", like a citation, so that this:

    From Tuesday on, I'll be on vacation.
    becomes this:

    >From Tuesday on, I'll be on vacation.
    ...just to keep that mail message from screwing up the mailbox format when it gets delivered somewhere.

    Anyway, if you're not too concerned about parsing MIME structures but just want something closer to a straight grep, taking apart the mailbox file by hand isn't very heavy lifting at all.
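
For illustration, here is a minimal sketch of the by-hand approach described above, not a definitive implementation. The names parse_message and filter_mbox, the sample addresses, and the selection rule are all hypothetical. It assumes that every line starting with "From " is a message separator (that is, that body lines were ">"-escaped on delivery), and it makes no attempt to decode MIME or encoded headers.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: split one raw message into a header hash and a body.
sub parse_message {
    my ($raw) = @_;
    my ($head, $body) = split /\n\n/, $raw, 2;   # first blank line ends the headers
    $body = '' unless defined $body;
    $head =~ s/\n[ \t]+/ /g;                     # unfold continued header lines
    my %header;
    for my $line (split /\n/, $head) {
        next if $line =~ /^From /;               # skip the mbox separator line
        my ($name, $value) = $line =~ /^([^:\s]+):\s*(.*)$/ or next;
        $header{lc $name} = $value;              # last occurrence wins
    }
    return (\%header, $body);
}

# Hypothetical helper: copy the messages for which $wanted returns true.
sub filter_mbox {
    my ($in, $out, $wanted) = @_;
    my $message = '';
    my $flush = sub {
        return unless length $message;
        print {$out} $message if $wanted->(parse_message($message));
        $message = '';
    };
    while (my $line = <$in>) {
        $flush->() if $line =~ /^From /;         # separator: previous message is complete
        $message .= $line;
    }
    $flush->();                                  # do not forget the last message
}

# Demo on an in-memory mailbox (made-up data).
my $mbox = <<'END';
From alice@example.com Mon Dec  1 12:00:00 2008
From: alice@example.com
Subject: vacation

>From Tuesday on, I'll be on vacation.

From bob@example.com Mon Dec  1 13:00:00 2008
From: bob@example.com
Subject: lunch

Pizza today?

END

open my $in, '<', \$mbox or die "cannot read in-memory mailbox: $!";
my $result = '';
open my $out, '>', \$result or die "cannot write in-memory mailbox: $!";

# Example rule: subject mentions "vacation" and the body mentions "Tuesday".
filter_mbox($in, $out, sub {
    my ($header, $body) = @_;
    return ($header->{subject} // '') =~ /vacation/i && $body =~ /Tuesday/;
});
close $out;
print $result;
```

The demo keeps only the first message and writes it out unchanged, separator line included, so the output is still a mailbox. To use it as the filter in the bzip2 pipeline from the question, the same call would take \*STDIN and \*STDOUT in place of the in-memory handles.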

Re: How to filter a *NIX Mailbox
by b10m (Vicar) on Dec 02, 2008 at 09:55 UTC