Finding a proper distribution of translation probabilities is one of the most important factors impacting the effectiveness of a cross-language information retrieval system. In th...
This paper develops a general, formal framework for modeling term dependencies via Markov random fields. The model allows for arbitrary text features to be incorporated as eviden...
We demonstrate a phonotactic-semantic paradigm for spoken document categorization. In this framework, we define a set of acoustic words instead of lexical words to represent acous...
We describe a task-based evaluation to determine whether multi-document summaries measurably improve user performance when using online news browsing systems for directed research...
Kathleen McKeown, Rebecca J. Passonneau, David K. ...
This paper describes a method of detecting Japanese Katakana variants from a large corpus. Katakana words, which are mainly used as loanwords, cause problems with information retr...
When applying blind relevance feedback for ad hoc document retrieval, is it possible to identify, a priori, the set of query terms that will most improve retrieval performance? Ca...
The bottleneck for dictionary-based cross-language information retrieval is the lack of comprehensive dictionaries, in particular for many different languages. We here introduce a...