A Comparative Study on Language Identification Methods

In this paper we present two experiments that compare different language identification algorithms. Short-word-based, frequent-word-based and n-gram-based approaches are considered and combined with the Ad-Hoc Ranking classification method. The language identification process can be subdivided into two main steps: first, a document model is generated for the document and a language model for each language; second, the language of the document is determined on the basis of the language models and added to the document as additional information. We present our evaluation results and discuss the importance of a dynamic value for the out-of-place measure.
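The two steps described above can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: it assumes the widely used rank-order formulation of the out-of-place measure over character n-gram profiles. The function names (`ngram_profile`, `out_of_place`, `identify`), the trigram size, the profile length `top_k`, and the reading of the "dynamic value" as a penalty tied to the language profile's length are all assumptions made for the example.

```python
from collections import Counter


def ngram_profile(text, n=3, top_k=300):
    """Step 1: build a model as a list of the top_k character n-grams,
    ordered from most to least frequent."""
    # Normalize case and whitespace, and pad so word boundaries form n-grams.
    padded = f" {' '.join(text.lower().split())} "
    counts = Counter(padded[i:i + n] for i in range(len(padded) - n + 1))
    # Sort by descending count, then alphabetically, so ranks are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [gram for gram, _ in ranked[:top_k]]


def out_of_place(doc_profile, lang_profile, max_penalty=None):
    """Sum of rank differences between a document and a language profile.
    N-grams missing from the language profile cost max_penalty."""
    if max_penalty is None:
        # Assumption: a "dynamic" penalty that follows the profile length
        # instead of being a fixed constant.
        max_penalty = len(lang_profile)
    lang_rank = {gram: rank for rank, gram in enumerate(lang_profile)}
    return sum(
        abs(rank - lang_rank[gram]) if gram in lang_rank else max_penalty
        for rank, gram in enumerate(doc_profile)
    )


def identify(text, lang_profiles, n=3, top_k=300):
    """Step 2: pick the language whose profile is closest to the document's."""
    doc_profile = ngram_profile(text, n, top_k)
    return min(
        lang_profiles,
        key=lambda lang: out_of_place(doc_profile, lang_profiles[lang]),
    )
```

In use, `lang_profiles` would be a dictionary mapping a language label to a profile built with `ngram_profile` from training text in that language. A short-word or frequent-word variant would replace the character n-grams with word counts and keep the same ranking step.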

Lena Grothe, Ernesto William De Luca, Andreas Nürnberger

Education | Language Identification | Language Identification Algorithms | Language Model | LREC 2008

» A comparative study of methods for estimating query language models with pseudo feedback

» A Comparative Study of Parameter Estimation Methods for Statistical Natural Language Proce...

» Identification of Latin-Based Languages through Character Stroke Categorization

» Methods for Classifying Spot Welding Processes: A Comparative Study of Performance

» Romanian Zero Pronoun Distribution: A Comparative Study

» Language Identification of Short Text Segments with N-gram Models

» A Comparative Study of Energy Minimization Methods for Markov Random Fields

» Comparative analysis of five protein-protein interaction corpora

Post Info

Added	29 Oct 2010
Updated	29 Oct 2010
Type	Conference
Year	2008
Where	LREC
Authors	Lena Grothe, Ernesto William De Luca, Andreas Nürnberger
