This paper presents an unsupervised learning approach to building a non-English (Arabic) stemmer. The stemming model is based on statistical machine translation and it uses an Eng...
In Chinese, phrases and named entities play a central role in information retrieval. Abbreviations, however, make keyword-based approaches less effective. This paper presents an em...
We present a new model for detection of noun phrases in unrestricted text, whose most outstanding feature is its flexibility: the system is able to recognize noun phrases similar ...
Mining bilingual data (including bilingual sentences and terms) from the Web can benefit many NLP applications, such as machine translation and cross-language information retrie...
Long Jiang, Shiquan Yang, Ming Zhou, Xiaohua Liu, ...
nt formal specifications of a new abstraction, weak sets, which can be used to alleviate high latencies when retrieving data from a wide-area information system like the World Wide...