regexp matching question

Tony Abou-Assaleh taa-HInyCGIudOg at public.gmane.org
Wed Oct 5 16:09:14 UTC 2005


If you would like to 'grep' a mailbox file and extract the messages that
match some regular expression, the easiest and fastest way I know of is to
use cgrep from the University of Waterloo.

I have a link to the source code, plus a report that shows how to do the
above, at:

http://www.cosc.brocku.ca/~taa/greps.html

Using regexec might not be the way to go because it requires the entire
string to be in memory as one NUL-terminated buffer. If you want to deal
with large strings (stored in files) properly, then you'd be reinventing
the grep program, so just look at the source code instead.
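
To illustrate the memory point, here is a minimal sketch in C. It is not
taken from cgrep or grep. It scans an mbox stream one line at a time, so
regexec only ever sees a single line. The function name
count_matching_messages is made up for this example, and treating every
line that starts with "From " as a message separator is a simplifying
assumption. A pattern that spans several lines cannot match this way,
which is the part a grep-like tool has to solve properly.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <regex.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper (not part of any library): count the messages in an
 * mbox stream that contain at least one line matching the POSIX extended
 * regexp `pattern`. Returns -1 if the pattern does not compile. */
long count_matching_messages(FILE *fp, const char *pattern)
{
    regex_t re;
    char *line = NULL;   /* buffer owned and grown by getline() */
    size_t cap = 0;
    ssize_t len;
    long count = 0;
    int in_message = 0;  /* have we seen a "From " separator yet? */
    int matched = 0;     /* has the current message already matched? */

    assert(fp != NULL && pattern != NULL);

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    while ((len = getline(&line, &cap, fp)) != -1) {
        /* Assumption: any line starting with "From " opens a new message.
         * The separator line itself is not tested against the pattern. */
        if (strncmp(line, "From ", 5) == 0) {
            in_message = 1;
            matched = 0;
            continue;
        }

        /* Skip text before the first message, and skip the rest of a
         * message that has already been counted. */
        if (!in_message || matched)
            continue;

        /* Strip the trailing newline so that `$` anchors at end of line. */
        if (len > 0 && line[len - 1] == '\n')
            line[len - 1] = '\0';

        if (regexec(&re, line, 0, NULL, 0) == 0) {
            matched = 1;
            count++;
        }
    }

    free(line);
    regfree(&re);
    return count;
}
```

Calling count_matching_messages(fp, "^Subject:.*regexp") on an open mbox
stream would count the messages whose Subject line mentions regexp. Only
one line is held in memory at a time, however large the message body is.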

Cheers,

TAA

-----------------------------------------------------
Tony Abou-Assaleh
Lecturer, Computer Science Department
Brock University, St. Catharines, ON, Canada, L2S 3A1
Office: MC J215
Tel:    +1(905)688-5550 ext. 5243
Fax:    +1(905)688-3255
Email:  taa-HInyCGIudOg at public.gmane.org
WWW:    http://www.cosc.brocku.ca/~taa/
----------------------[THE END]----------------------

On Wed, 5 Oct 2005, Peter wrote:

>
> Hi all
>
> I need to match email messages using regexec(3). I would like to match
> as much as possible in one piece, i.e. the interesting headers and the
> body (which could be large). Can this be done, and is it economical
> (speed-wise) to use a single hairy regexp to match the whole message, or
> is it better to match the message and then parse it? formail already
> does this somehow (I have not looked yet).
>
> Peter
> --
> The Toronto Linux Users Group.      Meetings: http://tlug.ss.org
> TLUG requests: Linux topics, No HTML, wrap text below 80 columns
> How to UNSUBSCRIBE: http://tlug.ss.org/subscribe.shtml
>