the web as a database

Zbigniew Koziol zkoziol-Zd07PnzKK1IAvxtiuMwx3w at public.gmane.org
Wed Apr 20 19:49:36 UTC 2005


Just an idea. Maybe someone will want to comment?

Wouldn't it be wonderful if, after typing a complex SQL-like query into 
Google, I got a precise response listing all the web pages on the 
Internet that contain the information that interests me?

The problem is that Google-like search engines index the content of HTML 
pages, and HTML was designed to hold information that is to be 
displayed in a browser. It was not designed to categorise that 
information. HTML meta tags such as keywords and description are merely 
an attempt to add some categorisation, and a poor one. That's why 
good search engines do not take their content very seriously.

Let's take an example: suppose that for some reason I wanted to build a 
database of all the physics laboratories in the world. I would like to 
know where they are located (country, state/province, exact address), 
what the main subjects of their research are, whom to contact there for 
information, the names of the main researchers, etc. In principle all 
this information already exists on the web. I do not need to explain, 
however, that finding and categorising it is an extremely tedious and 
time-consuming task.
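
To make the wish concrete, here is a minimal sketch in Python of the 
kind of SQL-like query I have in mind, run against an in-memory SQLite 
table. Everything in it is an assumption made for illustration: the 
table name "labs", its columns, and the three placeholder rows are 
invented and do not describe real laboratories.

```python
import sqlite3

# Hypothetical schema: these names come from no existing standard,
# they only mirror the fields listed in the paragraph above.
SCHEMA = """
CREATE TABLE labs (
    name     TEXT,
    country  TEXT,
    province TEXT,
    subject  TEXT,
    contact  TEXT
)
"""

# Placeholder rows, invented purely for illustration.
SAMPLE_ROWS = [
    ("Example Lab A", "Canada", "Ontario",
     "superconductivity", "a@example.org"),
    ("Example Lab B", "Canada", "Quebec",
     "optics", "b@example.org"),
    ("Example Lab C", "Poland", "Mazowieckie",
     "superconductivity", "c@example.org"),
]


def build_db():
    """Create an in-memory database holding the placeholder rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.executemany("INSERT INTO labs VALUES (?, ?, ?, ?, ?)",
                     SAMPLE_ROWS)
    return conn


def labs_by_subject(conn, subject, country):
    """Return (name, contact) pairs for one subject in one country."""
    query = ("SELECT name, contact FROM labs "
             "WHERE subject = ? AND country = ? ORDER BY name")
    return conn.execute(query, (subject, country)).fetchall()


if __name__ == "__main__":
    conn = build_db()
    print(labs_by_subject(conn, "superconductivity", "Canada"))
```

The query itself is trivial. The hard part is filling such a table 
from pages that were written only to be displayed.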

Hence, I am talking about a new sort of web functionality, where the 
data could be taken out, reworked, and displayed in a different way.

Some may suggest the use of XML. Probably a good idea, although I do 
not have a general understanding of what is behind XML. In principle, I 
imagine, web sites could host a special file on their server that 
would contain detailed information about the content of the site, or at 
least about the company, as in the example above. Something like the 
way /robots.txt or newsfeed.xml is used now.
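
As a rough sketch of what such a file might look like and how a 
crawler could read it, here is a small Python example using the 
standard xml.etree.ElementTree module. The element names (siteinfo, 
organization, subject, and so on) are hypothetical, made up for this 
message and not part of any existing standard, and the record itself 
is a placeholder.

```python
import xml.etree.ElementTree as ET

# Hypothetical file content a site might publish alongside /robots.txt.
# The vocabulary below is an assumption, not an existing format.
SITEINFO_XML = """<siteinfo>
  <organization type="physics-laboratory">
    <name>Example Lab A</name>
    <country>Canada</country>
    <province>Ontario</province>
    <subject>superconductivity</subject>
    <subject>low-temperature physics</subject>
    <contact>a@example.org</contact>
  </organization>
</siteinfo>
"""


def parse_siteinfo(text):
    """Turn the XML text into a list of plain dictionaries."""
    root = ET.fromstring(text)
    records = []
    for org in root.findall("organization"):
        records.append({
            "type": org.get("type"),
            "name": org.findtext("name"),
            "country": org.findtext("country"),
            "province": org.findtext("province"),
            "subjects": [s.text for s in org.findall("subject")],
            "contact": org.findtext("contact"),
        })
    return records


if __name__ == "__main__":
    for record in parse_siteinfo(SITEINFO_XML):
        print(record["name"], record["country"], record["subjects"])
```

Records collected this way from many sites could then be loaded into 
an ordinary database and queried, instead of being guessed from the 
displayed HTML.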

Is there no other way?

If somebody is interested in working with me on introducing this 
sort of new technology, please write. I already have some rough ideas 
about how to do it, but I am still very interested in hearing your 
comments. The subject seems to be original, with a potentially huge 
impact on web development. Something like creating a new standard.

zb.
--
Zbigniew Koziol, SoftQuake^(tm) Open Source Business Solutions
Web Development, Linux, Web Mail Fax Voice Servers, Networking
Consultations, Innovative Technologies Tel/Fax: 1-416-530-2780
Toronto,  Canada,  http://www.softquake.ca,  info-lcEyp1+e+UdAFePFGvp55w at public.gmane.org



--
The Toronto Linux Users Group.      Meetings: http://tlug.ss.org
TLUG requests: Linux topics, No HTML, wrap text below 80 columns
How to UNSUBSCRIBE: http://tlug.ss.org/subscribe.shtml




