Sciweavers

COLING
2008

Source Language Markers in EUROPARL Translations

Download www.aclweb.org

This paper shows that it is very often possible to identify the source language of medium-length speeches in the EUROPARL corpus on the basis of frequency counts of word n-grams (87.2% to 96.7% accuracy, depending on classification method). The paper also examines in detail which positive markers are most powerful and identifies a number of linguistic aspects as well as culture- and domain-related ones.
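
To make the idea of classifying by word n-gram frequency counts concrete, here is a minimal sketch. It is not the paper's method: the abstract does not name the classification methods used, so this example assumes a simple add-one-smoothed log-likelihood comparison. The function names, the two training sentences, and the labels "fr" and "de" are all invented for illustration and are not taken from EUROPARL.

```python
from collections import Counter
import math


def word_ngrams(text, n_max=2):
    """Return all word n-grams of length 1..n_max as tuples of lowercased tokens."""
    tokens = text.lower().split()
    return [tuple(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]


def train(labelled_texts, n_max=2):
    """Accumulate n-gram frequency counts per source-language label."""
    counts = {}
    for label, text in labelled_texts:
        counts.setdefault(label, Counter()).update(word_ngrams(text, n_max))
    return counts


def classify(text, counts, n_max=2):
    """Return the label whose smoothed n-gram frequencies best explain the text.

    Assumption: add-one smoothing over the joint vocabulary, with one extra
    slot reserved for n-grams never seen in training.
    """
    vocab = set().union(*counts.values())
    grams = word_ngrams(text, n_max)

    def score(label):
        c = counts[label]
        total = sum(c.values()) + len(vocab) + 1
        return sum(math.log((c[g] + 1) / total) for g in grams)

    return max(counts, key=score)


# Hypothetical toy data: English text labelled with an assumed source language.
TOY_TRAINING = [
    ("fr", "we must indeed take note of the fact that the commission has acted"),
    ("de", "i am of the opinion that we should however reject this proposal"),
]

counts = train(TOY_TRAINING)
print(classify("i am of the opinion that this is wrong", counts))
```

With realistic data, each label would be trained on many speeches rather than one sentence, and the n-grams that contribute most to one label's score would be candidates for the "positive markers" the abstract mentions.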

Hans van Halteren

COLING 2008 | Computational Linguistics | Frequency Counts | Medium-length Speeches | Word N-grams

Post Info

Added	29 Oct 2010
Updated	29 Oct 2010
Type	Conference
Year	2008
Where	COLING
Authors	Hans van Halteren
