A scripting question (harder than I thought)

Walter Dnes waltdnes-SLHPyeZ9y/tg9hUCZPvPmw at public.gmane.org
Sun Feb 29 01:02:53 UTC 2004


On Fri, Feb 27, 2004 at 02:18:29AM -0500, William Park wrote
> On Fri, Feb 27, 2004 at 01:16:07AM -0500, Walter Dnes wrote:
> > 
> >   How do I remove every second end-of-line to achieve this ?
> 
> sed 'N;s/\n//'

  It works, and with almost no moving parts.  As a consultant, you're
probably used to customers giving you a spec, and when you come up with
a solution to their spec, they say "That's not what I really meant"<g>.
It turns out that I can't simply "remove every second end-of-line".  Some
DNSbls (like sorbs and spamhaus) throw a monkey-wrench into things by
returning more than 1 line in the %TXT% message.  Example...

> Wed Feb 11 16:58:49 2004 (gongguail21-DVq+YeWbj7f7Za/I2yyZNw at public.gmane.org) -> (waltdnes at waltdnes.org) [221.150.38.95]
> Rejected due to lack of hostname. If yours was a legitimate email see http://www.waltdnes.org/bypass.html to bypass block.
> 
> Wed Feb 11 19:01:42 2004 (b446ahi-O5WfVfzUwx8 at public.gmane.org) -> (waltdnes at waltdnes.org) [24.47.205.208]
> HTTP Proxy See: http://www.dnsbl.sorbs.net/cgi-bin/lookup?IP=24.47.205.208
> SOCKS Proxy See: http://www.dnsbl.sorbs.net/cgi-bin/lookup?IP=24.47.205.208
> Spam Received See: http://www.dnsbl.sorbs.net/cgi-bin/lookup?IP=24.47.205.208
> Dynamic IP Address See: http://www.dnsbl.sorbs.net/cgi-bin/lookup?IP=24.47.205.208 Email rejected on advice of dnsbl.sorbs.net. If yours was a legitimate email see http://www.waltdnes.org/bypass.html to bypass block.
> 
> Wed Feb 11 19:28:06 2004 (News-8EyPGC1My3pUtMC7kmziAtBPR1lH4CV8 at public.gmane.org) -> (waltdnes at waltdnes.org) [80.64.107.202]
> Rejected due to lack of hostname. If yours was a legitimate email see http://www.waltdnes.org/bypass.html to bypass block.

  I use the aggregate sorbs zone to cut down on DNS traffic.  Sorbs has
that IP address listed under 4 categories, and returns 4 lines.  Do I
need to use Python or something similar ?  The rules for the logfile are

1) The "anchor line" begins with regex "^(Mon|Tue|Wed|Thu|Fri|Sat|Sun) (Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) " and ends with "]"

2) Followed by N lines (N > 0) of text, which do not match rule 1.

3) Followed by one zero-length line

  I think that a script-based approach would work like so

  - find a line that does *NOT* match rule 1
  - join it to the previous line
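
  A minimal sketch of that approach in Python, assuming the three rules
above describe the logfile completely.  The names ANCHOR and join_records
are made up for illustration, and joining with a single space is my own
choice; nothing here is a tested solution for the real log.

```python
import re

# Rule 1: an anchor line starts with a weekday and a month and ends with "]".
ANCHOR = re.compile(
    r"^(Mon|Tue|Wed|Thu|Fri|Sat|Sun) "
    r"(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) .*\]$"
)


def join_records(lines):
    """Yield one joined line per log record (hypothetical helper)."""
    record = []
    for raw in lines:
        line = raw.rstrip("\n")
        if ANCHOR.match(line):
            # A new anchor starts a new record; flush any unfinished one.
            if record:
                yield " ".join(record)
            record = [line]
        elif not line:
            # Rule 3: a zero-length line closes the current record.
            if record:
                yield " ".join(record)
            record = []
        elif record:
            # Rule 2: any other line is joined to the previous line.
            record.append(line)
        # Text that appears before the first anchor is silently dropped.
    if record:
        yield " ".join(record)
```

  Feeding it an open file object (or sys.stdin) and printing each yielded
string should give one line per rejected message, however many lines the
DNSbl returned.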

-- 
Walter Dnes <waltdnes-SLHPyeZ9y/tg9hUCZPvPmw at public.gmane.org>
Email users are divided into two classes;
1) Those who have effective spam-blocking
2) Those who wish they did
--
The Toronto Linux Users Group.      Meetings: http://tlug.ss.org
TLUG requests: Linux topics, No HTML, wrap text below 80 columns
How to UNSUBSCRIBE: http://tlug.ss.org/subscribe.shtml




