We address the task of answering natural language questions by using the large number of Frequently Asked Questions (FAQ) pages available on the web. The task involves three steps...
A system, called NewsStand, is introduced that automatically extracts images from news articles. The system takes RSS feeds of news articles and applies an online clustering algori...
Intelligent access to information requires semantic integration of structured databases with unstructured textual resources. While the semantic integration problem has been widely...
In this paper we investigate a novel and important problem in multi-document summarization, i.e., how to extract an easy-to-understand English summary for non-native readers. Exist...
This paper explores correspondence and mixture topic modeling of documents tagged from two different perspectives. There has been ongoing work in topic modeling of documents with...