Information retrieval tools and search engines have mainly been leveraging research results and technologies developed for the English language. In this paper we report the issues...
This paper presents an interdisciplinary investigation of statistical information retrieval (IR) techniques for protein identification from tandem mass spectra, a challenging probl...
Language identification is the task of identifying the language a given document is written in. This paper describes a detailed examination of what models perform best under diffe...
In this paper we present two experiments conducted to compare different language identification algorithms. Short words-, frequent words- and n-gram-based approaches are co...
Lena Grothe, Ernesto William De Luca, Andreas Nü...
A variety of different scripts are used in writing languages throughout the world. In a multi-script, multilingual environment, it is essential to know the script used in writing a...