In TREC 2003, our experiments have been concentrated only on the topic distillation task. We first simply apply the term-based technique to the .GOV web collection, and then re-r...
Traditional web link-based ranking schemes use a single score to measure a page’s authority without concern of the community from which that authority is derived. As a result, a...
TopicShop is an interface that helps users evaluate and organize collections of web sites. The main interface components are site profiles, which contain information that helps us...
Brian Amento, Loren G. Terveen, William C. Hill, D...
The rapid growth of the World-Wide Web poses unprecedented scaling challenges for general-purpose crawlers and search engines. In this paper we describe a new hypertext resource d...
Soumen Chakrabarti, Martin van den Berg, Byron Dom
Given only the URL of a web page, can we identify its topic? This is the question that we examine in this paper. Usually, web pages are classified using their content [7], but a U...