NLGbAse: A Free Linguistic Resource for Natural Language Processing Systems

15 years 5 months ago

Download www.lrec-conf.org

The availability of labeled language resources, such as annotated corpora and domain-dependent labeled language resources, is crucial for experiments in the field of Natural Language Processing. Most often, due to a lack of resources, manual verification and annotation of electronic text material is a prerequisite for the development of NLP tools. In the context of under-resourced languages, the lack of corpora becomes a crucial problem because most of the research efforts are supported by organizations with limited funds. Using free, multilingual and highly structured corpora like Wikipedia to produce automatically labeled language resources can be an answer to those needs. This paper introduces NLGbAse, a multilingual linguistic resource built from the Wikipedia encyclopedic content. This system produces structured metadata which makes possible the automatic annotation of corpora with syntactic and semantic labels. Each metadata record contains semantic and statistical information related to a...

Eric Charton, Juan Manuel Torres Moreno
