Scientists often search for document-elements like tables, figures, or algorithm pseudo-codes. Domain scientists and researchers report important data, results and algorithms usi...
Text similarity spans a spectrum, with broad topical similarity near one extreme and document identity at the other. Intermediate levels of similarity – resulting from summariza...
Donald Metzler, Yaniv Bernstein, W. Bruce Croft, A...
Easy access to the Web has led to increased potential for students cheating on assignments by plagiarising others’ work. By the same token, Web-based tools offer the potential f...
Raphael A. Finkel, Arkady B. Zaslavsky, Kriszti&aa...
Automatically extracting keyphrases from documents is a task with many applications in information retrieval and natural language processing. Document retrieval can be biased towar...
This paper describes how to automatically cross-reference documents with Wikipedia: the largest knowledge base ever known. It explains how machine learning can be used to identify...