Sciweavers

LREC
2010
13 years 7 months ago
Language Identification of Short Text Segments with N-gram Models
There are many accurate methods for language identification of long text samples, but identification of very short strings still presents a challenge. This paper studies a languag...
Tommi Vatanen, Jaakko J. Väyrynen, Sami Virpi...
LREC
2010
13 years 10 months ago
Mining Naturally-occurring Corrections and Paraphrases from Wikipedia's Revision History
Naturally-occurring instances of linguistic phenomena are important both for training and for evaluating automatic text processing. When available in large quantities, they also p...
Aurélien Max, Guillaume Wisniewski
LREC
2010
13 years 11 months ago
Morphological Annotation of Quranic Arabic
Kais Dukes, Nizar Habash
LREC
2010
13 years 11 months ago
Virtual Language Observatory: The Portal to the Language Resources and Technology Universe
Over the years, the field of Language Resources and Technology (LRT) has developed a tremendous amount of resources and tools. However, there is no ready-to-use map that researche...
Dieter Van Uytvanck, Claus Zinn, Daan Broeder, Pet...
LREC
2010
13 years 11 months ago
A Description of Morphological Features of Serbian: a Revision using Feature System Declaration
In this paper we discuss some well-known morphological descriptions used in various projects and applications (most notably MULTEXT-East and Unitex) and illustrate the encountered...
Cvetana Krstev, Ranka Stankovic, Dusko Vitas
LREC
2010
13 years 11 months ago
The Web Library of Babel: evaluating genre collections
We present experiments in automatic genre classification on web corpora, comparing a wide variety of features on several different genre-annotated datasets (HGC, I-EN, KI-04, KRYS...
Serge Sharoff, Zhili Wu, Katja Markert
LREC
2010
13 years 11 months ago
Senso Comune
Alessandro Oltramari, Guido Vetere, Maurizio Lenze...
LREC
2010
13 years 11 months ago
Creating a Reusable English-Chinese Parallel Corpus for Bilingual Dictionary Construction
This paper first describes an experiment to construct an English-Chinese parallel corpus, then applies the Uplug word alignment tool to the corpus, and finally produces and evaluat...
Hercules Dalianis, Hao-chun Xing, Xin Zhang
LREC
2010
13 years 11 months ago
NLGbAse: A Free Linguistic Resource for Natural Language Processing Systems
Availability of labeled language resources, such as annotated corpora and domain-dependent labeled language resources, is crucial for experiments in the field of Natural Language...
Eric Charton, Juan Manuel Torres Moreno