Technology in the field of digital media generates huge amounts of nontextual information, audio, video, and images, along with more familiar textual information. The potential for...
Multi-document summarization aims to create a compressed summary while retaining the main characteristics of the original set of documents. Many approaches use statistics and mach...
Dingding Wang, Tao Li, Shenghuo Zhu, Chris H. Q. D...
The production of gold standard corpora is time-consuming and costly. We propose an alternative: the ‘silver standard corpus’ (SSC), a corpus that has been generated by the ha...
Dietrich Rebholz-Schuhmann, Antonio Jimeno-Yepes, ...
Every time a user engaged in work reads or writes, the user spontaneously generates new information needs: to understand the text he or she is reading or to supply more substance ...
David A. Evans, Gregory Grefenstette, Yan Qu, Jame...
Web search engines discover indexable documents by recursively ‘crawling’ from a seed URL. Their rankings take into account link popularity. While this works well, it introduc...
Tom Rowlands, David Hawking, Ramesh Sankaranarayan...