This paper describes the participation of UAIC team at the LogCLEF 2011 initiative, language identification task. Our approach is an aggregation of known methods for recognizing la...
There are many accurate methods for language identification of long text samples, but identification of very short strings still presents a challenge. This paper studies a languag...
Language identification is the task of identifying the language a given document is written in. This paper describes a detailed examination of what models perform best under diffe...
The task of identifying the language of text or utterances has a number of applications in natural language processing. Language identification has traditionally been approached w...
Language Identification (LID) refers to the task of identifying an unknown language from the test utterances. In this paper, a new method of feature extraction, viz., Teager Energy...
In our participation to the 2010 LogCLEF track we focused on the analysis of the European Library (TEL) logs and in particular we experimented with the identification of the natura...
Two major stages stages in language identification systems can be identified: the language modeling stage, where the distinctive features of languages are determined and stored in...
In this paper we present two experiments conducted for comparison of different language identification algorithms. Short words-, frequent words- and n-gram-based approaches are co...
Lena Grothe, Ernesto William De Luca, Andreas N&uu...
As the data for more and more languages is finding its way into digital form, with an increasing amount of this data being posted to the Web, it has become possible to collect lan...
Existing Language Identification (LID) approaches do reach 100% precision, in most common situations, when dealing with documents written in just one language, and when those docu...