the web as a database
Zbigniew Koziol
zkoziol-Zd07PnzKK1IAvxtiuMwx3w at public.gmane.org
Wed Apr 20 19:49:36 UTC 2005
Just an idea. Maybe someone will want to comment?
Wouldn't it be wonderful if, after typing a complex SQL-like query into
Google, I got a precise response listing all the web pages on the
Internet that contain the information that interests me?
The problem is that Google-like search engines index the content of HTML
pages, and HTML was designed to hold information that is supposed to be
displayed in the browser. It was not designed to categorise that
information. The HTML meta tags such as keywords and description are an
attempt to add some categorisation, but a poor one. That's why good
search engines do not take their content very seriously.
Let's take an example: suppose I wanted to build a database of all
physics laboratories in the world. I would like to know where they are
located (country, state/province, exact address), what the main
subjects of their research are, whom to contact there for information,
what the names of the main researchers are, etc. In principle, all this
information already exists on the web. I do not need to explain,
however, that it is an extremely tedious and time-consuming task to find
it and categorise it.
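To make the example concrete, here is a minimal sketch in Python of the
kind of SQL-like query I have in mind, assuming the laboratory data had
already been collected in one place. The table name, the column names
and the sample rows are all invented for illustration; no such database
exists yet.

```python
import sqlite3

# Hypothetical schema: one row per laboratory. Every name here is an
# assumption made for this sketch, not an existing standard.
SCHEMA = """
CREATE TABLE labs (
    name     TEXT,
    country  TEXT,
    province TEXT,
    address  TEXT,
    subject  TEXT,
    contact  TEXT
)
"""

# Invented sample data, standing in for what would be gathered from the web.
SAMPLE_ROWS = [
    ("Example Low Temperature Lab", "Canada", "Ontario",
     "1 Example Street, Toronto", "superconductivity", "info@lab-a.example"),
    ("Example Optics Lab", "Canada", "Quebec",
     "2 Example Avenue, Montreal", "optics", "info@lab-b.example"),
    ("Example Magnet Lab", "Poland", "Mazowieckie",
     "3 Example Road, Warsaw", "superconductivity", "info@lab-c.example"),
]


def build_database(rows):
    """Create an in-memory database and load the laboratory rows into it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.executemany("INSERT INTO labs VALUES (?, ?, ?, ?, ?, ?)", rows)
    return conn


def labs_by_subject(conn, country, subject):
    """Return (name, contact) pairs for labs in a country working on a subject."""
    cur = conn.execute(
        "SELECT name, contact FROM labs "
        "WHERE country = ? AND subject = ? ORDER BY name",
        (country, subject),
    )
    return cur.fetchall()
```

Calling labs_by_subject(build_database(SAMPLE_ROWS), "Canada",
"superconductivity") would then return only the matching laboratory,
which is the precise kind of answer a keyword search cannot give.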
Hence, I am talking about a new sort of web functionality, where the
data could be taken out, reworked, and displayed in a different way.
Some may suggest the use of XML. Probably a good idea. I do not,
however, have a general understanding of what is behind XML. In
principle, I imagine, web sites could host a special file on their
server that would contain detailed information about the content of the
site, or at least about the company, as in the example above. Something
like the way /robots.txt is used now, or newsfeed.xml.
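As a rough illustration of such a file, here is a small Python sketch
that reads a made-up XML description of a site and turns it into
records that could be loaded into a database. The element names
(site, organisation, subject and so on) and the sample content are
purely my assumptions; they are not part of any existing format.

```python
import xml.etree.ElementTree as ET

# Invented example of what a site might publish about itself.
# The structure and the element names are hypothetical.
EXAMPLE_FILE = """\
<site>
  <organisation type="physics-laboratory">
    <name>Example Low Temperature Lab</name>
    <country>Canada</country>
    <province>Ontario</province>
    <address>1 Example Street, Toronto</address>
    <subject>superconductivity</subject>
    <subject>magnetism</subject>
    <contact>info@lab.example</contact>
  </organisation>
</site>
"""


def parse_site_description(xml_text):
    """Turn the XML description into a list of plain dictionaries."""
    root = ET.fromstring(xml_text)
    records = []
    for org in root.findall("organisation"):
        records.append({
            "type": org.get("type"),
            "name": org.findtext("name"),
            "country": org.findtext("country"),
            "province": org.findtext("province"),
            "address": org.findtext("address"),
            # An organisation may list several research subjects.
            "subjects": [s.text for s in org.findall("subject")],
            "contact": org.findtext("contact"),
        })
    return records
```

A crawler could fetch such a file from each site, much as it fetches
/robots.txt today, and collect the resulting records without having to
guess anything from the HTML meant for display.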
Is there no other way?
If somebody is interested in working with me on introducing this sort
of new technology, please write. I already have some rough ideas on how
to do that, but I am still very interested in hearing your comments.
The subject seems to be original, with a potentially huge impact on web
development. Something like creating a new standard.
zb.
--
Zbigniew Koziol, SoftQuake^(tm) Open Source Business Solutions
Web Development, Linux, Web Mail Fax Voice Servers, Networking
Consultations, Innovative Technologies Tel/Fax: 1-416-530-2780
Toronto, Canada, http://www.softquake.ca, info-lcEyp1+e+UdAFePFGvp55w at public.gmane.org