The task of identifying redundant information in documents that are generated from multiple sources provides a significant challenge for summarization and QA systems. Traditional ...
In this paper we propose a completely unsupervised method for open-domain entity extraction and clustering over query logs. The underlying hypothesis is that classes defined by mi...
Search engine result pages (SERPs) are known as the most expensive real estate on the planet. Most queries yield millions of organic search results, yet searchers seldom look beyon...
Multi-document summarization aims to create a compressed summary while retaining the main characteristics of the original set of documents. Many approaches use statistics and mach...
Dingding Wang, Tao Li, Shenghuo Zhu, Chris H. Q. D...
The application of document clustering to information retrieval has been motivated by the potential effectiveness gains postulated by the Cluster Hypothesis. The hypothesis states ...