Models of document indexing and document retrieval have been extensively studied. The integration of these two classes of models has been the goal of several researchers b...
An important class of searches on the World Wide Web aims to find an entry page (homepage) of an organisation. Entry page search is quite different from Ad Hoc search. Ind...
The automatic detection of shared content in written documents (which includes text reuse and its unacknowledged commitment, plagiarism) has become an important probl...
Researchers spend a large amount of their time searching through an ever-increasing number of scientific articles. Although users of scientific search engines prefer the ranking o...
This paper presents Carnegie Mellon University’s experiments on the mixed named-page and homepage finding task of the TREC 12 Web Track. Our results were strong; we achieved the...