The quality of document content, which is an issue that is usually ignored for the traditional ad hoc retrieval task, is a critical issue for Web search. Web pages have a huge var...
This paper describes the development of algorithms for extracting the title and the names of the authors from documents available on the World Wide Web. In this paper we describe ...
Eric G. Berkowitz, Mohamed Reda Elkhadiri, Tim Sah...
This paper considers the problem of identifying on the Web compound documents (cDocs): groups of web pages that in aggregate constitute semantically coherent information entities...
Web search is a key technology of the Web, since it is the primary way to access content on the Web. Current standard Web search is essentially based on a combination of textual ke...
Meaningful evaluation of web search must take account of spam. Here we conduct a user experiment to investigate whether satisfaction with search engine result pages as a whole is ...
Timothy Jones, David Hawking, Paul Thomas, Ramesh ...