In a corpus of jokes, a human might judge two documents to be the "same joke" even if characters, locations, and other details are varied. A given joke could be retold w...
Latent Semantic Indexing (LSI) is commonly used to match queries to documents in information retrieval applications. LSI has been shown to improve retrieval performance for some, ...
In this paper, we present CoTester, a system designed to decrease the difficulty of testing web applications. CoTester allows testers to create test scripts that are represented ...
The purpose of this paper is to apply and evaluate the bibliometric method Bradfordizing for information retrieval (IR) experiments. Bradfordizing is used for generating core docu...
This paper proposes a method for extracting bilingual text pairs from a comparable corpus. The basic idea of the method is to apply bootstrapping to an existing corpusbased cross-...
Hiroshi Masuichi, Raymond Flournoy, Stefan Kaufman...