We investigate four hierarchical clustering methods (single-link, complete-link, groupwise-average, and single-pass) and two linguistically motivated text features (noun phrase he...
Vasileios Hatzivassiloglou, Luis Gravano, Ankineed...
We present a Java-based framework, SWAMI (Shared Wisdom through the Amalgamation of Many Interpretations) for building and studying collaborative filtering systems. SWAMI consist...
Danyel Fisher, Kris Hildrum, Jason I. Hong, Mark W...
This paper explores the use of hierarchical structure for classifying a large, heterogeneous collection of web content. The hierarchical structure is initially used to train diffe...
Most web pages are linked to others with related content. This idea, combined with the idea that text in, and possibly around, HTML anchors describes the pages to which th...
This article compares search effectiveness when using query-based Internet search (via the Google search engine), directory-based search (via Yahoo) and phrase-based query reformul...
This paper presents a novel way of examining the accuracy of the evaluation measures commonly used in information retrieval experiments. It validates several of the rules-of-thum...
We introduce OCELOT, a prototype system for automatically generating the “gist” of a web page by summarizing it. Although most text summarization research to date has ...
This paper investigates whether a machine can automatically learn the task of finding, within a large collection of candidate responses, the answers to questions. The lea...
Adam L. Berger, Rich Caruana, David Cohn, Dayne Fr...