regexp matching question

Walter Dnes waltdnes-SLHPyeZ9y/tg9hUCZPvPmw at
Fri Oct 7 05:13:27 UTC 2005

On Wed, Oct 05, 2005 at 11:28:27PM -0400, Behdad Esfahbod wrote

> I'm not sure what you exactly mean, but AFAIK, a new message is
> started when the regexp "^From:" matches, and the header ends
> when two consecutive new lines (Dos or Unix conventions) match.
> What's wrong with that?

  In ordinary emails, probably nothing.  With maildir, I don't see the
"^From " (*NOT* "^From:" as in your message).  The rule there is that
the headers begin at line 1, and end with the first set of two
consecutive "newlines".
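  The single-message rule above can be sketched in a few lines of Python.
This is only an illustration, not what procmail or formail actually runs;
the function name split_headers and the sample message are made up for
the example.  It treats the first pair of consecutive newlines (DOS or
Unix style) as the end of the headers.

```python
import re


def split_headers(raw: str):
    """Split one message into (headers, body) at the first blank line.

    Assumes `raw` holds exactly one message, as with a maildir file.
    """
    # \r?\n\r?\n matches two consecutive newlines in either convention.
    parts = re.split(r"\r?\n\r?\n", raw, maxsplit=1)
    headers = parts[0]
    body = parts[1] if len(parts) > 1 else ""
    return headers, body


# Hypothetical sample message using DOS line endings.
sample = (
    "From: alice@example.org\r\n"
    "Subject: test\r\n"
    "\r\n"
    "Body line\r\n"
    "\r\n"
    "More body\r\n"
)
headers, body = split_headers(sample)
```

  Because the split happens only once, later blank lines stay in the
body, so sample headers pasted further down cannot be mistaken for the
real ones.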

  But what happens when you're on a procmail or anti-spam mailing list
where people deliberately (and properly, I might add) post sample
headers that match your criteria for a new message?  When procmail is
being passed one message at a time (e.g. from POP or fetchmail, or when
analyzing individual maildir-format messages), the way to avoid problems
is to specify a ridiculously high number with formail's "-m" parameter,
which sets how many consecutive header fields formail must find before
it accepts the start of a new message.  I.e. you avoid false splits by
effectively telling formail not to split out messages, regardless of
what it sees.

  As I said, that works fine when you *KNOW* that you're working with
*ONLY ONE MESSAGE*.  The mbox format problem is that *ALL* the messages
are kept in one humongous file.  formail *MUST* attempt to split them
apart using heuristics.  If someone plunks a set of valid sample headers
into the body of a message without quoting the headers, that will cause
formail to think it sees a new message.
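  To make the false split concrete, here is a deliberately simplified
model of an mbox splitter in Python.  It is *not* formail's actual
algorithm; the names split_mbox and min_fields, the "blank line, then
From_ line, then N header fields" rule, and the sample mbox are all
assumptions made for illustration.  min_fields only loosely mimics what
"-m" does.

```python
import re

# A header field name: printable ASCII without ':' followed by a colon.
HEADER_FIELD = re.compile(r"^[!-9;-~]+:")


def count_header_fields(lines, start):
    """Count consecutive header fields beginning at lines[start]."""
    n = 0
    for line in lines[start:]:
        if HEADER_FIELD.match(line):
            n += 1
        elif line[:1] in (" ", "\t") and n > 0:
            continue  # folded continuation of the previous field
        else:
            break
    return n


def split_mbox(text, min_fields=2):
    """Split text wherever a 'From ' line looks like a message start."""
    lines = text.split("\n")
    starts = []
    for i, line in enumerate(lines):
        preceded_by_blank = i == 0 or lines[i - 1] == ""
        if (line.startswith("From ") and preceded_by_blank
                and count_header_fields(lines, i + 1) >= min_fields):
            starts.append(i)
    if not starts:
        return [text]  # nothing qualified, so treat it as one message
    bounds = starts + [len(lines)]
    return ["\n".join(lines[a:b]) for a, b in zip(bounds, bounds[1:])]


# ONE real message whose body contains unquoted sample headers.
mbox = (
    "From alice@example.org Fri Oct  7 05:13:27 2005\n"
    "From: alice@example.org\n"
    "Subject: sample headers\n"
    "\n"
    "Here is what a message looks like:\n"
    "\n"
    "From bob@example.org Wed Oct  5 23:28:27 2005\n"
    "From: bob@example.org\n"
    "Subject: hello\n"
    "\n"
    "That was the sample.\n"
)

naive = split_mbox(mbox)                    # falsely splits into two
safe = split_mbox(mbox, min_fields=1000)    # stays as one message
```

  With the low threshold the pasted sample is taken for a second message;
with the absurdly high one nothing is ever split, which is only safe
when you already know the input is a single message.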

  The message that I was replying to talked about problems splitting out
messages, and that implies mbox format.

Walter Dnes <waltdnes-SLHPyeZ9y/tg9hUCZPvPmw at>
An infinite number of monkeys pounding away on keyboards will
eventually produce a report showing that Windows is more secure,
and has a lower TCO, than linux.
The Toronto Linux Users Group.      Meetings:
TLUG requests: Linux topics, No HTML, wrap text below 80 columns
