IRAL
2003
ACM

Issues in pre- and post-translation document expansion: untranslatable cognates and missegmented words

Query expansion by pseudo-relevance feedback is a well-established technique in both mono- and cross-lingual information retrieval, enriching and disambiguating the typically terse queries provided by searchers. Comparable document-side expansion is a more recent development, motivated by error-prone transcription and translation processes in spoken document and cross-language retrieval. In the cross-language case, one can perform expansion before translation, after translation, or at both points. We investigate the relative impact of pre- and post-translation document expansion for cross-language spoken document retrieval in Mandarin Chinese. We find that post-translation expansion yields a highly significant improvement in retrieval effectiveness, while improvements due to pre-translation expansion, alone or in combination, do not reach significance. We identify two key factors of segmentation and translation in Chinese orthography that limit the effectiveness of pre-tra...
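As a rough illustration of document-side expansion by pseudo-relevance feedback, the sketch below treats a document as a query against a side collection, takes the top-ranked documents as pseudo-relevant, and appends their highest-weighted new terms to the document. It is a minimal sketch under stated assumptions, not the method used in the paper: the TF-IDF weighting, the cosine ranking, the whitespace tokenizer, and all function names (`expand_document`, `idf_table`, and so on) are hypothetical choices made for illustration. In a cross-language setting, the side collection would presumably be in the source language for pre-translation expansion and in the target language for post-translation expansion.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: whitespace tokenization. Real Mandarin text would need
    # a word segmenter, which is one of the issues the abstract raises.
    return text.lower().split()


def idf_table(collection):
    # Smoothed inverse document frequency over the side collection.
    n = len(collection)
    df = Counter()
    for doc in collection:
        df.update(set(tokenize(doc)))
    return {t: math.log((n + 1) / (c + 1)) + 1.0 for t, c in df.items()}


def tfidf(text, idf):
    # Terms unseen in the side collection get weight 0.
    tf = Counter(tokenize(text))
    return {t: f * idf.get(t, 0.0) for t, f in tf.items()}


def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def expand_document(doc, collection, top_docs=2, top_terms=3):
    """Append up to top_terms new terms drawn from the top_docs
    side-collection documents most similar to doc."""
    idf = idf_table(collection)
    doc_vec = tfidf(doc, idf)
    # Use the document itself as the query against the side collection.
    ranked = sorted(
        collection,
        key=lambda d: cosine(doc_vec, tfidf(d, idf)),
        reverse=True,
    )
    # Pool term weights from the pseudo-relevant documents.
    feedback = Counter()
    for d in ranked[:top_docs]:
        for term, weight in tfidf(d, idf).items():
            feedback[term] += weight
    original = set(tokenize(doc))
    new_terms = [t for t, _ in feedback.most_common() if t not in original]
    new_terms = new_terms[:top_terms]
    return doc + " " + " ".join(new_terms) if new_terms else doc
```

For example, calling `expand_document("president election", side_collection, top_docs=2, top_terms=2)` on a small side collection of news-like strings returns the original text followed by up to two added terms taken from the most similar documents.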
Gina-Anne Levow
Added 05 Jul 2010
Updated 05 Jul 2010
Type Conference
Year 2003
Where IRAL
Authors Gina-Anne Levow