Spiders and crawlers
Lennart Sorensen
lsorense-1wCw9BSqJbv44Nm34jS7GywD8/FfD2ys at public.gmane.org
Mon Apr 5 16:41:28 UTC 2010
On Thu, Apr 01, 2010 at 05:56:35PM -0400, Evan Leibovitch wrote:
> I'm looking to implement a spidering system intended to look through a bunch
> of catalog websites, in order to track changes to those catalogs (with the
> help of a backend MySQL system).
I always wonder: Why MySQL? PostgreSQL is an obviously better and more
scalable choice. Why do so many people just barge ahead with MySQL?
> The Wikipedia entry for "web crawler" returns a lot of interesting choices;
> I'm wondering if anyone here has experience in either writing one or using
> an existing open source one. I'm hoping for something that is reasonably
> configurable so that one doesn't need to know a language like C or Java to
> make minor config changes.
>
> Any help is appreciated.
Well I haven't ever done that. :)
--
Len Sorensen
--
The Toronto Linux Users Group. Meetings: http://gtalug.org/
TLUG requests: Linux topics, No HTML, wrap text below 80 columns
How to UNSUBSCRIBE: http://gtalug.org/wiki/Mailing_lists