Best tools for making a spider?

Alex Beamish talexb-Re5JQEeQqe8AvxtiuMwx3w at public.gmane.org
Wed Jan 3 22:16:10 UTC 2007


On 1/3/07, Evan Leibovitch <evan-ieNeDk6JonTYtjvyW6yDsg at public.gmane.org> wrote:
>
> This isn't meant to start a flamewar, honestly.
>
> I'm just wondering if there are some languages that are either optimized
> or otherwise more suitable than others for the specific task of writing
> a specialized web spider.
>
> So far the people I've spoken to say this is a perfect task for Ruby,
> but I don't know it at all. There is no previous baggage on this
> particular project so we can choose any tool we want -- but once chosen
> I'd like to stay with it. Having some existing open source templates or
> existing code to build upon is always nice too.
>
> I'm not going to be the programmer (we do want working code, after all
> :-) ) but I do have some say in the tech to be used. Any suggestions or
> pointers of where to explore this particular question are appreciated.


I would expect you could use Andy Lester's WWW::Mechanize (a Perl module) for spidering. See

  http://search.cpan.org/~petdance/WWW-Mechanize-1.20/lib/WWW/Mechanize.pm

for lots more information. That page also links to O'Reilly's *Spidering
Hacks*

  http://www.oreilly.com/catalog/spiderhks/

which would probably be a useful resource no matter which language or
platform you choose.
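
To give a feel for what a spider does regardless of toolkit, here is a
minimal sketch in Python using only the standard library. It is not
WWW::Mechanize's API, just the same basic loop: fetch a page, pull out
the links, queue the ones on the same host. The names LinkParser, fetch,
crawl and max_pages are made up for illustration, and a real spider
would also need robots.txt handling and rate limiting, which this leaves
out.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collect the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch(url):
    """Download a page and return its text (illustrative helper)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, fetch_page=fetch, max_pages=20):
    """Breadth-first crawl that stays on the start URL's host."""
    host = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        try:
            html = fetch_page(url)
        except OSError:
            continue  # skip pages that fail to load
        visited.append(url)
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop any #fragment part.
            absolute, _fragment = urldefrag(urljoin(url, href))
            if urlparse(absolute).netloc == host and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return visited
```

Passing fetch_page as a parameter is a design choice for this sketch: it
lets you swap in a polite, throttled downloader later, or a fake one for
testing, without touching the crawl loop.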

-- 
Alex Beamish
Toronto, Ontario
aka talexb