Sciweavers

NLDB
2007
Springer

Character N-Grams Translation in Cross-Language Information Retrieval

Abstract. This paper describes a new technique for the direct translation of character n-grams for use in Cross-Language Information Retrieval systems. This solution avoids the need for word normalization during indexing or translation, and it can also deal with out-of-vocabulary words. This knowledge-light approach does not rely on language-specific processing, and it can be used with languages of very different natures even when linguistic information and resources are scarce or unavailable. Our proposal also aims to make the n-gram alignment process faster than in previous approaches.
Key words: Cross-Language Information Retrieval, character n-grams, translation algorithms, alignment algorithms, association measures.
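To make the terms in the abstract concrete, the following is a minimal illustrative sketch and not the authors' algorithm. It splits words into overlapping character n-grams and scores candidate source/target n-gram pairs over a toy sentence-aligned corpus. The abstract does not say which association measure the paper uses; the Dice coefficient here is an assumption, and the function names `char_ngrams` and `dice_scores` and the default `n=4` are hypothetical.

```python
from collections import Counter
from itertools import product


def char_ngrams(text, n=4):
    """Return the overlapping character n-grams of each word in `text`.

    Words no longer than n characters are kept whole (an assumption made
    for this sketch, not something stated in the abstract).
    """
    grams = []
    for word in text.lower().split():
        if len(word) <= n:
            grams.append(word)
        else:
            grams.extend(word[i:i + n] for i in range(len(word) - n + 1))
    return grams


def dice_scores(sentence_pairs, n=4):
    """Score every co-occurring (source, target) n-gram pair with Dice.

    Dice(s, t) = 2 * cooc(s, t) / (freq(s) + freq(t)), where frequencies
    count the aligned sentence pairs in which an n-gram appears.
    """
    src_freq, tgt_freq, pair_freq = Counter(), Counter(), Counter()
    for src, tgt in sentence_pairs:
        src_set = set(char_ngrams(src, n))
        tgt_set = set(char_ngrams(tgt, n))
        src_freq.update(src_set)
        tgt_freq.update(tgt_set)
        pair_freq.update(product(src_set, tgt_set))
    return {
        (s, t): 2 * c / (src_freq[s] + tgt_freq[t])
        for (s, t), c in pair_freq.items()
    }


# Toy parallel corpus (hypothetical data, for illustration only).
pairs = [("milk", "leche"), ("milk tea", "leche te")]
scores = dice_scores(pairs)
```

In this toy corpus, n-grams that always appear together (such as `milk` and `lech`) receive the maximum score, while pairs that co-occur only some of the time score lower. Because the scoring works directly on character n-grams, it needs no stemmer or dictionary, which is the sense in which such an approach can be called knowledge-light.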
Added 08 Jun 2010
Updated 08 Jun 2010
Type Conference
Year 2007
Where NLDB
Authors Jesús Vilares, Michael P. Oakes, Manuel Vilares Ferro