regexp

John Macdonald john-Z7w/En0MP3xWk0Htik3J/w at public.gmane.org
Sat Apr 24 21:29:28 UTC 2004


On Sat, Apr 24, 2004 at 11:01:30PM +0300, Peter L. Peres wrote:
> 
> On Sat, 24 Apr 2004, John Macdonald wrote:
> 
> > That will reject "abc=deh" which I'd include in the
> > specified "def and anything else".  You need:
> >
> >      /abc=(([a-eg-z][a-z][a-z])|([a-z][a-fh-z][a-z])|([a-z][a-z][a-gi-z]))/
> >
> > It gets even messier if you also want to allow
> > other than exactly 3 lower case letters to be in the
> > assigned value.
> 
> Argh. I did something like this in Perl:
> 
> while (<>) {
> 	if ( /=<([^>]*)>/ && ($tm = "\Q$1") && ($tm ne "foo\@bar\.baz") )  {
> 		print 'got it = ($t)';
> 	}
> }
> 
> which is ugly beyond words. There has got to be a better way. Later I'll
> want to prune matches to $1 by a list so I'll likely use a hash or a
> function for the .ne. part. If you haven't guessed yet, this is about
> pruning certain email addresses from a list extracted from mail logs
> (whitelisting/blacklisting etc). The above works and was tested on a log
> with 20000+ lines, which it managed in a couple of tens of seconds on a
> slow machine (with cpu load 0.95 over ~3000 matches). Surely there is a
> way to specify a negative match-block in regexp ?! Anyway when I tried to
                                                  ^^

You are so close!  The Perl pattern (?!PAT) is a
negative lookahead: if PAT can match at this point,
(?!PAT) fails.  If the trial match of PAT fails,
(?!PAT) succeeds without actually matching any
characters in the string; it is only an assertion.
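
A minimal sketch of that behaviour, using a made-up
pattern /foo(?!bar)/ and made-up input strings (neither
is from the original question).  The helper name
match_foo_not_bar is only for illustration:

```perl
use strict;
use warnings;

# Return the text matched by /foo(?!bar)/, or undef if there is no match.
# The lookahead is zero-width, so only "foo" ends up in the match.
sub match_foo_not_bar {
    my ($s) = @_;
    return $s =~ /(foo(?!bar))/ ? $1 : undef;
}

for my $s ("foobaz", "foobar", "foo") {
    my $m = match_foo_not_bar($s);
    printf "%-8s %s\n", $s, defined $m ? "matched '$m'" : "no match";
}
```

"foobaz" and "foo" match (the matched text is just "foo"
in both cases), while "foobar" does not, because the
lookahead sees "bar" right after "foo".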

The original question could be matched with:

    /abc=(?!fgh)\w{3}/

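but he was looking for a regex library pattern, not a
Perl one.  (Note that \w also accepts digits, upper case
letters and underscore; use [a-z]{3} to insist on exactly
three lower case letters.)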
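
As a rough check that the lookahead form accepts the same
values as the long alternation quoted above, the sketch
below tries every three-letter lower case value against
both.  It makes two assumptions that are not in the
original patterns: both are anchored with ^ and $, and
[a-z]{3} is used in place of \w{3} so that the two
patterns cover the same alphabet.  The helper name
count_disagreements is made up for the example.

```perl
use strict;
use warnings;

# The alternation from the quoted message, anchored (assumption).
my $alternation = qr/^abc=(?:[a-eg-z][a-z][a-z]|[a-z][a-fh-z][a-z]|[a-z][a-z][a-gi-z])$/;

# The lookahead form, with [a-z]{3} instead of \w{3} (assumption).
my $lookahead = qr/^abc=(?!fgh)[a-z]{3}$/;

# Count the three-letter values on which the two patterns disagree.
sub count_disagreements {
    my $n = 0;
    for my $v ('aaa' .. 'zzz') {
        my $s    = "abc=$v";
        my $a_ok = $s =~ $alternation ? 1 : 0;
        my $l_ok = $s =~ $lookahead   ? 1 : 0;
        $n++ if $a_ok != $l_ok;
    }
    return $n;
}

print "disagreements: ", count_disagreements(), "\n";
```

Both patterns reject only "abc=fgh", so the count comes
out as 0.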

Your pattern:

	if ( /=<([^>]*)>/ && ($tm = "\Q$1") && ($tm ne "foo\@bar\.baz") )  {
		print 'got it = ($t)';
	}

could be written as:

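	if ( /=<(?!foo\@bar\.baz>)([^>]*)>/ )  {
		print "got it = ($1)";
	}

The ">" inside the lookahead ties the test to the whole
address, so a longer address that merely starts the same
way (foo@bar.bazaar.org, say) is still reported.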
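
For the longer list of addresses mentioned in the quoted
message, one lookahead per address gets unwieldy.  A hash
lookup on the captured address is one possible approach;
this is only a sketch, and the %skip hash, the
wanted_address helper and the sample lines are all made
up for illustration:

```perl
use strict;
use warnings;

# Hypothetical blocklist; the addresses are placeholders.
my %skip = map { $_ => 1 } ('foo@bar.baz', 'spam@example.com');

# Return the address between "=<" and ">" unless it is on the blocklist.
sub wanted_address {
    my ($line) = @_;
    return unless $line =~ /=<([^>]*)>/;
    return $skip{$1} ? undef : $1;
}

# Made-up lines in the "=<address>" shape used in the thread.
for my $line ('from=<foo@bar.baz>', 'from=<alice@example.org>') {
    my $addr = wanted_address($line);
    print "got it = ($addr)\n" if defined $addr;
}
```

The regex only extracts the address; the exact-match test
is left to the hash, so adding or removing an address
does not touch the pattern.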

> condense the above if() into a single expression using lookahead in Perl
> (?! etc) it did not work as I feel it should. This is my first time with
> lookahead so it may be I am doing something wrong. Could someone rewrite
> the above using lookahead as an example ? I am hoping to be able to
> rewrite this using compiled regexps in C and make it more efficient.
> 
> tia, and thanks for all who posted so far,
> Peter
> 
> 
> --
> The Toronto Linux Users Group.      Meetings: http://tlug.ss.org
> TLUG requests: Linux topics, No HTML, wrap text below 80 columns
> How to UNSUBSCRIBE: http://tlug.ss.org/subscribe.shtml
