The World Wide Web is a large, heterogeneous, distributed collection of documents connected by hypertext links. The most common technology currently used for searching the Web depends on sending information retrieval requests to "index servers" that index as many documents as they can find by navigating the network. One problem with this is that users must be aware of the various index servers (over a dozen of them are currently deployed on the Web), of their strengths and weaknesses, and of the peculiarities of their query interfaces. A more serious problem is that these queries cannot exploit the structure and topology of the document network. In this paper we propose a query language, WebSQL, that takes advantage of multiple index servers without requiring users to know about them, and that integrates textual retrieval with structure and topology-based queries. We give a formal semantics for WebSQL using a calculus based on a novel "virtual graph" model of a document...
Alberto O. Mendelzon, George A. Mihaila, Tova Milo