Stemming can improve retrieval accuracy, but stemmers are language-specific. Character n-gram tokenization achieves many of the benefits of stemming in a language independent way,...
We consider a model of analog computation which can recognize various languages in real time. We encode an input word as a point in Rd by composing iterated maps, and then apply i...
This paper presents a geometric approach to meaning representation within the framework of continuous mathematics. Meaning representation is a central issue in Natural Language Pr...
In TREC 2004, the Database and Information System Lab (DBIS) at University of Illinois at Chicago (UIC) participates in the robust track, which is a traditional ad hoc retrieval t...
The bottleneck for dictionary-based cross-language information retrieval is the lack of comprehensive dictionaries, in particular for many different languages. We here introduce a...