parsing HTML with awk or sed

William Park opengeometry-FFYn/CNdgSA at public.gmane.org
Wed Feb 25 22:52:03 UTC 2009


> > some are single line:
> > 
> > <p>data</p>
> > 
> > and some are multi-line:
> > 
> > <p>
> > More data
> > 
> > </p>
> 
> And some are:
> <p>stuff
> <p>other stuff
> <p>yet more stuff

Your main problem is determining where each P element ends.  If it's terminated with </p>, then there are many ways to cut/slice your file.  If it's not terminated, then you first need to find a way to terminate it.
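
To make that concrete, here is a rough sketch in Python rather than awk or sed. It assumes a P ends at whichever comes first: an explicit </p>, the start of the next <p>, or the end of the file. The names P_BLOCK and extract_paragraphs are made up for illustration, and the regex is only meant for simple input like the samples quoted above, not for arbitrary HTML (comments, attributes containing ">", and so on).

```python
import re

# A P element starts at <p ...> and runs until the first of:
#   an explicit </p>, the next <p ...>, or the end of the input.
# \b keeps tags such as <pre> from being mistaken for <p>.
P_BLOCK = re.compile(
    r"<p\b[^>]*>(.*?)(?=</p\s*>|<p\b|\Z)",
    re.IGNORECASE | re.DOTALL,
)


def extract_paragraphs(html):
    """Return the text of each P element with whitespace collapsed."""
    return [" ".join(m.group(1).split()) for m in P_BLOCK.finditer(html)]


if __name__ == "__main__":
    sample = (
        "<p>data</p>\n"
        "<p>\nMore data\n\n</p>\n"
        "<p>stuff\n<p>other stuff\n<p>yet more stuff\n"
    )
    for text in extract_paragraphs(sample):
        print(text)
```

Once every paragraph has a known end like this, the same rule (stop at </p>, the next <p>, or end of file) can be carried over to whatever tool you prefer.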
--William


--
The Toronto Linux Users Group.      Meetings: http://gtalug.org/
TLUG requests: Linux topics, No HTML, wrap text below 80 columns
How to UNSUBSCRIBE: http://gtalug.org/wiki/Mailing_lists




