text indexing on Linux?

Ted ted.leslie-Re5JQEeQqe8AvxtiuMwx3w at public.gmane.org
Thu Jul 5 16:38:55 UTC 2012


Are the contents basically random dictionary words, i.e. drawn from a
set of 600k+ possible "words"?
Or are the contents limited to a small subset of "words"?
Also, how many files are you talking about?

-tl

On 07/05/2012 12:31 PM, William Park wrote:
> Hi all,
>
> Suppose all your files are text files and contain 10 words max.  What
> program would you use to index them based on contents?  That is, given a
> set of words, it has to return the name of files that contain those
> words.
>
> I know of "updatedb" and "locate", but they index only filenames, not
> the content.  For my need, "grep" is still faster than any SQL solution,
> but I'm curious as to what is the correct approach.
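
For what it's worth, the usual structure for "given words, return the
files" is an inverted index: a map from each word to the set of files
that contain it. Below is a minimal Python sketch of that idea, not a
recommendation of a specific tool. The names build_index and search are
made up for illustration, and it assumes UTF-8 text, case-insensitive
matching, and an index small enough to hold in memory.

```python
import os
import re
from collections import defaultdict


def build_index(root):
    """Map each lowercased word to the set of file paths that contain it."""
    index = defaultdict(set)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Assumption: files are small text files; undecodable bytes
            # are replaced rather than raising an error.
            with open(path, encoding="utf-8", errors="replace") as f:
                for word in re.findall(r"\w+", f.read().lower()):
                    index[word].add(path)
    return index


def search(index, words):
    """Return the sorted paths of files that contain every word in `words`."""
    sets = [index.get(w.lower(), set()) for w in words]
    if not sets:
        return []
    return sorted(set.intersection(*sets))
```

Usage would be along the lines of idx = build_index("/some/dir") followed
by search(idx, ["foo", "bar"]). Whether this beats plain grep depends on
the answers to the questions above: the index has to be rebuilt or
updated when files change, so it mainly pays off when the same set of
files is queried many times.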

--
The Toronto Linux Users Group.      Meetings: http://gtalug.org/
TLUG requests: Linux topics, No HTML, wrap text below 80 columns
How to UNSUBSCRIBE: http://gtalug.org/wiki/Mailing_lists




