This paper considers the problem of identifying on the Web compound documents (cDocs) ? groups of web pages that in aggregate constitute semantically coherent information entities...
Since WWW encourages hypertext and hypermedia document authoring (e.g. HTML or XML), Web authors tend to create documents that are composed of multiple pages connected with hyperl...
This paper describes research in methods to discover Web Service Description Language (WSDL) documents. This work extends current discovery research through use of the Google Web ...
The Semantic Web seems to be evolving into a property-linked web of RDF data, conceptually divorced from (but physically housed in) the hyperlinked web of HTML documents. We discus...
In this paper, we propose a new approach to discover informative contents from a set of tabular documents (or Web pages) of a Web site. Our system, InfoDiscoverer, first partition...