regexp matching question
Behdad Esfahbod
behdad-26n5VD7DAF2Tm46uYYfjYg at public.gmane.org
Thu Oct 6 03:28:27 UTC 2005
On Wed, 5 Oct 2005, Walter Dnes wrote:
> On Thu, Oct 06, 2005 at 12:58:36AM +0300, Peter wrote
>
> > The problem is that formail fails to properly split certain files
> > and I do not know why. I tried to understand the problem but formail
> > source is un-maintainable imho. Thus I am trying to work around it
> > using my own matcher. That's where the regexp question came in.
>
> The problem is that there are no "out-of-band" separators in an mbox
> file. There are certain header-pattern conventions which imply a
> separator. You can run into situations where the text of an email
> message contains what looks like headers/separators, even to the most
> sophisticated matching algorithm. It can be hard to separate email body
> >From headers<g>. It's not quite as bad as Microsoft's
> "begin loveletter.txt.exe" cockup, but the principle is the same. If
> you subscribe to a procmail or other anti-spam list, where people post
> sample email headers, you *WILL* see formail screw up the splits. This
> is one area where maildir format reigns supreme.
I'm not sure exactly what you mean, but AFAIK a new message
starts when the regexp "^From:" matches, and the header ends
at the first blank line, i.e. two consecutive newlines (DOS or
Unix convention). What's wrong with that?
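
To make the rule concrete, here is a rough Python sketch of it. It is
only an illustration under my own assumptions, not what formail does:
the function names and the sample text are made up, and the classic
mbox convention actually separates messages on "From " (with a space),
not "From:". The sample also shows the case Walter describes, where a
header pasted into a message body is taken for a new message.

```python
import re

# Rule as stated above: a line starting with "From:" opens a new message.
# (Assumption for illustration; classic mbox uses "From " with a space.)
SEPARATOR = re.compile(r"^From:", re.MULTILINE)
# The header ends at the first blank line, Unix (\n) or DOS (\r\n) endings.
HEADER_END = re.compile(r"\r?\n\r?\n")


def split_messages(text):
    """Split text into messages at every line that starts with 'From:'."""
    starts = [m.start() for m in SEPARATOR.finditer(text)]
    ends = starts[1:] + [len(text)]
    return [text[s:e] for s, e in zip(starts, ends)]


def split_header_body(message):
    """Return (header, body), cut at the first blank line."""
    parts = HEADER_END.split(message, maxsplit=1)
    header = parts[0]
    body = parts[1] if len(parts) > 1 else ""
    return header, body


# Two real messages; the first one quotes a header inside its body.
sample = (
    "From: alice@example.com\n"
    "Subject: sample headers\n"
    "\n"
    "Here is a header I saw in some spam:\n"
    "From: spammer@example.com\n"
    "Subject: buy now\n"
    "\n"
    "From: bob@example.com\n"
    "Subject: re: sample headers\n"
    "\n"
    "Thanks.\n"
)

messages = split_messages(sample)
print(len(messages))  # prints 3, although only 2 messages were intended
```

The quoted "From:" line in the first body is indistinguishable from a
real separator under this rule, so the splitter cuts the first message
in two.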
--behdad
http://behdad.org/
--
The Toronto Linux Users Group. Meetings: http://tlug.ss.org
TLUG requests: Linux topics, No HTML, wrap text below 80 columns
How to UNSUBSCRIBE: http://tlug.ss.org/subscribe.shtml