We propose a method for constructing a vector for a document image to represent its content to facilitate text retrieval. The method is based on an N-Gram algorithm for text simil...
Never before have so many information sources been available. Most are accessible on-line and some exist on the Internet alone. However, this large information quantity makes inte...
In this paper, we present a multimodal parallel text-image corpus, and propose an image annotation method that exploits the textual information associated with images. Our corpus ...
We describe a task-based evaluation to determine whether multi-document summaries measurably improve user performance when using online news browsing systems for directed research...
Kathleen McKeown, Rebecca J. Passonneau, David K. ...
We are presenting a text analysis tool set that allows analysts in various fields to sieve through large collections of multilingual news items quickly and to find information that...