Tools to break apart large log files.

Christopher Browne cbbrowne-Re5JQEeQqe8AvxtiuMwx3w at public.gmane.org
Thu Dec 10 03:51:39 UTC 2009


On Wed, Dec 9, 2009 at 10:22 PM, Scott Sullivan <scott-lxSQFCZeNF4 at public.gmane.org> wrote:
> The log files have a time stamp at the front of each line, like this:
>
> [2009/12/09 19:10]  UserA: Text
> [2009/12/09 19:11]  UserB: Responce
> [2009/12/09 19:11]  UserB: More Text
> [2009/12/09 19:11]  UserA: ??? Profit
>
> I want something that will register when the gap between the time stamps is
> greater than a given value and split them out to separate files.

When I saw the title, I first thought about split.  (man 1 split)

The thing is, you're not so much "splitting files"; you are
classifying their contents, with a somewhat ill-defined way of
characterizing the partitions.

There are actually two pieces to the problem, and you'll need to
specify both in order to head to a solution:

a) There's a parsing problem, of determining the regular way of
picking out records and recognizing the bits used for classification.
The other Chris suggested awk as a tool that could be helpful for
this, which seems quite plausible.

There's also...

b) There's a naming problem, determining the names of the files that
you want to put those records into.  You need a consistent set of
rules for determining those names, and of opening them to put the
relevant bits into them.

I suspect awk is somewhat less useful for that; it has no explicit
open() function, so the only way to establish new streams is output
redirection (print > file), which gets awkward once you are juggling
many of them.

I'd be a bit more inclined to use Perl (familiarity + contempt) or
Lisp; if you were comfortable with Python, that would be a good
choice.  There are lots of plausible choices amongst the soi-disant
"scripting languages."
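
For what it's worth, here is a rough Python sketch of part a), assuming
every record starts with a "[YYYY/MM/DD HH:MM]" stamp as in the sample
above.  The names parse_stamp and split_on_gaps, and the 30 minute
default gap, are made up for illustration; lines with no stamp are
simply kept with the chunk in progress.

```python
import re
from datetime import datetime, timedelta

# Leading "[2009/12/09 19:10]" stamp, as in the sample log lines.
STAMP = re.compile(r"^\[(\d{4}/\d{2}/\d{2} \d{2}:\d{2})\]")


def parse_stamp(line):
    """Return the line's timestamp, or None if it has no leading stamp."""
    m = STAMP.match(line)
    if m is None:
        return None
    return datetime.strptime(m.group(1), "%Y/%m/%d %H:%M")


def split_on_gaps(lines, max_gap=timedelta(minutes=30)):
    """Yield lists of lines; a new list starts when the gap exceeds max_gap."""
    chunk, last = [], None
    for line in lines:
        stamp = parse_stamp(line)
        if stamp is not None:
            if last is not None and stamp - last > max_gap and chunk:
                yield chunk
                chunk = []
            last = stamp
        chunk.append(line)
    if chunk:
        yield chunk
```

Feeding it an open file (for chunk in split_on_gaps(open(name)): ...)
should give one list of lines per conversation, which still leaves the
question of where each list gets written.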

If you're opening up a lot of streams, and picking output filenames,
you'll need a clear policy for b); otherwise the process will be mighty
fragile.
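
One possible policy for b), again only a sketch in Python: name each
output file after the first timestamp in its chunk, add a counter if
that name is already taken, and close each file before opening the
next so you never hold many streams at once.  The names chunk_path and
write_chunk, the "chat" prefix and the ".log" suffix are all
assumptions of mine, not anything from the original request.

```python
from datetime import datetime
from pathlib import Path


def chunk_path(out_dir, first_stamp, prefix="chat"):
    """Name a chunk after its first timestamp; add a counter on collision."""
    base = f"{prefix}-{first_stamp:%Y%m%d-%H%M}"
    path = Path(out_dir) / f"{base}.log"
    n = 1
    while path.exists():
        path = Path(out_dir) / f"{base}-{n}.log"
        n += 1
    return path


def write_chunk(out_dir, first_stamp, lines):
    """Write one chunk to its own file, closing it before returning."""
    path = chunk_path(out_dir, first_stamp)
    with path.open("w", encoding="utf-8") as fh:
        for line in lines:
            # Make sure every record ends with exactly one newline.
            fh.write(line if line.endswith("\n") else line + "\n")
    return path
```

With a rule like this the same input always produces the same
filenames, and two chunks that start in the same minute don't clobber
each other.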
-- 
http://linuxfinances.info/info/linuxdistributions.html
Mike Ditka  - "If God had wanted man to play soccer, he wouldn't have
given us arms." -
http://www.brainyquote.com/quotes/authors/m/mike_ditka.html