The PDF format is commonly used for the exchange of documents on the Web and there is a growing need to understand and extract or repurpose data held in PDF documents. Many system...
In this paper, we describe a demonstration called an “End-to-End” demonstration developed for the 2005 offering of our grid computing course that was taught across the State o...
Electronic mail poses a number of unusual challenges for the design of information retrieval systems and test collections, including informal expression, conversational structure,...
Although text categorization is a burgeoning area of IR research, readily available test collections in this field are surprisingly scarce. We describe a methodology and system (...
The TextMap-TMT cross-language question answering system at USC-ISI was designed to answer Spanish questions from English documents. The system is fully automatic, includ...
Abdessamad Echihabi, Douglas W. Oard, Daniel Marcu...