A substantial subset of the web data follows some kind of underlying structure. Nevertheless, HTML does not contain any schema or semantic information about the data it represents...
A substantial subset of the web data follows some kind of underlying structure. In order to let software programs gain full benefit from these “semistructured” web sources, wra...
In this paper, we propose a method for mediatory summarization, which is a novel technique for facilitating users' assessments of the credibility of information on the Web. A...
In this paper, we describe a system that can extract record structures from web pages with no direct human supervision. Records are commonly occurring HTML-embedded data tuples th...
We present a new approach in web search engines. The web creates new challenges for information retrieval. The vast improvement in information access is not the only advantage res...