search engine pollution
Emma Jane Hogbin
emmajane-MHIYrZpDPrNWk0Htik3J/w at public.gmane.org
Fri Apr 23 18:14:40 UTC 2004
On Wed, Apr 21, 2004 at 12:19:26AM +0300, Peter L. Peres wrote:
> I have htdig installed to search documents on my own machine and I have
> noticed that a particular document may be hard to find because it seldom
> refers to itself, whereas many others refer to it. For example, searching
> for "rfc ftp" finds all the assigned numbers and protocol lists, with the
> real ftp protocols ranking low (in the first few tens out of 700+ matches
> in this case). What would be a way to improve this without using META
> tags and such (not all documents are text)? Using the full title also
> does not help. The referrers also use the full title ...
Assuming the pages are marked up with some kind of semantic markup
language, you can adjust how much weight the headings and title of a
document carry in the rankings. You really need to read the documentation
that goes with ht://dig: http://www.htdig.org/confindex.html
Specifically:
title_factor: http://www.htdig.org/attrs.html#title_factor
heading_factor: http://www.htdig.org/attrs.html#heading_factor
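
As a rough sketch of what that could look like in htdig.conf (the numbers
below are purely illustrative assumptions, not recommended or default
values; check the attribute pages above for what your version supports):

```
# Illustrative values only (assumed for this example).
# Count words found in a document's title more heavily.
title_factor:    200
# Count words found in headings more heavily than plain body text.
heading_factor:  20
```

Depending on the ht://Dig version, the index may need to be rebuilt before
changed factors take effect, so rerun your indexing step and then compare
the rankings for the same query.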
emma
--
Emma Jane Hogbin
[[ 416 417 2868 ][ www.xtrinsic.com ]]
--
The Toronto Linux Users Group. Meetings: http://tlug.ss.org
TLUG requests: Linux topics, No HTML, wrap text below 80 columns
How to UNSUBSCRIBE: http://tlug.ss.org/subscribe.shtml