Web sites are designed for a graphical mode of interaction. Sighted users can "cut to the chase" and quickly identify relevant information in Web pages. On the contrary, i...
This paper introduces an approach to address the problem of accessing conventional and geographic data from the Deep Web. The approach relies on describing the relevant d...
Helena Piccinini, Melissa Lemos, Marco A. Casanova...
Presence of duplicate documents in the World Wide Web adversely affects crawling, indexing and relevance, which are the core building blocks of web search. In this paper, we pres...
Hema Swetha Koppula, Krishna P. Leela, Amit Agarwa...
Planning out maintenance tasks to increase the quality of Web applications can be difficult for a manager. First, it is hard to evaluate the precise effect of a task on quality. S...
This paper considers the problem of identifying on the Web compound documents (cDocs), groups of web pages that in aggregate constitute semantically coherent information entities...