Sciweavers

COLING
2008

Source Language Markers in EUROPARL Translations

This paper shows that it is very often possible to identify the source language of medium-length speeches in the EUROPARL corpus on the basis of frequency counts of word n-grams (87.2%–96.7% accuracy depending on classification method). The paper also examines in detail which positive markers are most powerful and identifies a number of linguistic aspects as well as culture- and domain-related ones.
Hans van Halteren
Added 29 Oct 2010
Updated 29 Oct 2010
Type Conference
Year 2008
Where COLING
Authors Hans van Halteren