Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:
I have a couple of questions related to storage of mail messages in unix mailboxes...
The Iron Mail spam blocker can send quarantined mail to an external location -- a Unix mailbox. Do y'all know how these mailboxes are stored (flat file, binary file, etc.)? I would like to know if the headers for messages in a box can be extracted and indexed so that messages can be recalled by message ID or some other key(s).
Can message headers be read by Perl or C programs? Is there a command-line interface to SMTP mail that allows access to mailboxes? I have heard of mail, sendmail and mailx for simple mailing on Unix systems.
Sorry if this is too vague -- even if you knew of some good books that might help me exploit mail files for business purposes....
thanks.
Re: Perl & mail headers
by Stevie-O (Friar) on Apr 30, 2004 at 04:50 UTC
You're probably looking for something that will read the 'mbox' format -- Mail::MboxParser will do it. Or you can probably give Mail::Folder a shot, as it has an mbox driver. Here's a few more that came up when searching CPAN for 'mbox':
I strongly recommend you check out each of those, and decide which one has the features you need while being easiest to use.
--Stevie-O
$"=$,,$_=q>|\p4<6 8p<M/_|<('=>
.q>.<4-KI<l|2$<6%s!<qn#F<>;$,
.=pack'N*',"@{[unpack'C*',$_]
}"for split/</;$_=$,,y[A-Z a-z]
{}cd;print lc
Re: Perl & mail headers
by sgifford (Prior) on Apr 30, 2004 at 04:55 UTC
Unix mailboxes are flat files. Each message starts with the pattern:
/^From /
Immediately following the From_ line (the line that begins with "From " and a space) come the message headers, then a blank line, then the body of the message. The message ends at the next From_ line or at EOF.
As Stevie-O suggests, you'll probably want to use a mail parsing module for this. It's not that hard to write your own, but almost nobody gets it right the first time, and it's easier to just use one of the modules.
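To make the layout concrete, here is a minimal sketch (not a replacement for the CPAN modules) of how a mailbox could be indexed by Message-ID, as the original question asks. The names index_mbox and fetch_message are made up for illustration. It assumes every From_ line really is a message separator and that the Message-ID value sits on the same line as the header name, which is not guaranteed in real mailboxes.

```perl
use strict;
use warnings;

# Hypothetical helper: map each Message-ID to the byte offset of the
# From_ line that starts its message. Uses core Perl only.
sub index_mbox {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    binmode $fh;

    my %offset_for;
    my $start;             # offset of the current message's From_ line
    my $in_headers = 0;    # true between a From_ line and the blank line

    while (1) {
        my $pos  = tell $fh;    # offset before reading the next line
        my $line = <$fh>;
        last unless defined $line;

        if ($line =~ /^From /) {                      # a new message begins
            $start      = $pos;
            $in_headers = 1;
        }
        elsif ($in_headers && $line =~ /^\r?$/) {     # blank line ends headers
            $in_headers = 0;
        }
        elsif ($in_headers && $line =~ /^Message-ID:\s*(\S+)/i) {
            $offset_for{$1} = $start;
        }
    }
    close $fh;
    return \%offset_for;
}

# Hypothetical helper: return one message (From_ line, headers and body)
# starting at a byte offset produced by index_mbox().
sub fetch_message {
    my ($path, $offset) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    binmode $fh;
    seek($fh, $offset, 0) or die "Cannot seek in $path: $!";

    my $message = <$fh>;    # the From_ line itself
    return unless defined $message;
    while (defined(my $line = <$fh>)) {
        last if $line =~ /^From /;    # next message starts here
        $message .= $line;
    }
    close $fh;
    return $message;
}
```

Usage would look something like `my $idx = index_mbox('quarantine.mbox'); print fetch_message('quarantine.mbox', $idx->{'<some-id@example.com>'});`, where the file name and the ID are placeholders. Storing offsets rather than whole messages keeps the index small, but it goes stale as soon as the mailbox is rewritten.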
|
Lines in a message body matching /^From / should be escaped (conventionally by prefixing them with '>', giving '>From '), but it's usually better to pay attention to the Content-Length and Lines headers, provided they exist.
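As a rough illustration of the escaping, here is a sketch of the common '>' quoting convention in its reversible form, where any line matching /^>*From / gets one extra '>' on the way in and loses one on the way out. Not every mail program writes this variant, so treat the choice as an assumption; escape_from and unescape_from are made-up names.

```perl
use strict;
use warnings;

# Hypothetical helper: quote body lines that would otherwise look like a
# message separator. Lines already quoted (">From ") get one more ">",
# which is what makes the scheme reversible.
sub escape_from {
    my ($body) = @_;
    $body =~ s/^(>*From )/>$1/mg;
    return $body;
}

# Hypothetical helper: undo escape_from() by stripping exactly one ">".
sub unescape_from {
    my ($body) = @_;
    $body =~ s/^>(>*From )/$1/mg;
    return $body;
}
```

If the mailbox was written by software that only quotes bare "From " lines and leaves ">From " alone, unescaping cannot tell the two cases apart, which is one reason the length headers are the safer guide when present.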
Please note that rolling your own can cause some unexpected damage, and that it's been done already (many times over, and usually pretty well) and made available on CPAN (see the previous posts).
Good luck!
|
++ Thanks for the explanation. I was wondering what the file format actually was for mbox. Makes a lot of sense. Cheers!
Re: Perl & mail headers
by coec (Chaplain) on Apr 30, 2004 at 04:54 UTC
Firstly, have you looked on CPAN (http://search.cpan.org/search?query=mail&mode=module)? There are many mail modules; you may find something useful there.
Secondly, mail, mailx and sendmail are very different beasts. The first two are mail clients: they read what's in your mailbox and can hand mail to the MTA (mail transfer agent). Sendmail is an SMTP server (I won't go into why it shouldn't be used). SMTP stands for Simple Mail Transfer Protocol. The SMTP server is the thing that does the work, delivering mail so your client can read it.
There is shell access to sendmail but it isn't pretty.
You can read/parse mail headers with Perl, C and other languages. For Perl modules see the CPAN link above.
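For a feel of what header parsing involves in Perl, here is a minimal sketch using only core Perl. It assumes the message is already in a string, that the headers end at the first blank line, and that folded headers (continuation lines starting with a space or tab) can be joined with a single space, which is a simplification. parse_headers is a made-up name, and the CPAN modules handle many more edge cases.

```perl
use strict;
use warnings;

# Hypothetical helper: return a hash ref of lower-cased header name =>
# array ref of values (headers such as Received can repeat).
sub parse_headers {
    my ($message) = @_;

    # The header block is everything before the first blank line.
    my ($head) = split /\r?\n\r?\n/, $message, 2;
    return {} unless defined $head;

    # Unfold continuation lines into the line they belong to.
    $head =~ s/\r?\n[ \t]+/ /g;

    my %headers;
    for my $line (split /\r?\n/, $head) {
        next if $line =~ /^From /;    # skip an mbox From_ line, if any
        my ($name, $value) = $line =~ /^([^:\s]+):\s*(.*)$/ or next;
        push @{ $headers{lc $name} }, $value;
    }
    return \%headers;
}
```

Calling `parse_headers($message)->{'message-id'}[0]` would then give the first Message-ID value, if the message has one.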
CC
The above is an oversimplification of how mail works.
Re: Perl & mail headers
by McD (Chaplain) on Apr 30, 2004 at 14:59 UTC
Anonymous Monk writes:
even if you knew of some good books that might help me exploit mail files for business purposes
Here's the best I've found on the topic - it's very good:
Programming Internet Email
What a surprise, it's from O'Reilly. :-)
Good luck!
Peace,
-McD