Sciweavers

LREC
2008
Designing and Evaluating a Russian Tagset
This paper reports the principles behind designing a tagset to cover Russian morphosyntactic phenomena, modifications of the core tagset, and its evaluation. The tagset and associ...
Serge Sharoff, Mikhail Kopotev, Tomaz Erjavec, Ann...
LREC
2008
Is this NE tagger getting old?
This paper focuses on the influence of changing the text time frame on the performance of a named entity tagger. We followed a twofold approach to investigate this subject: on the...
Cristina Mota, Ralph Grishman
LREC
2008
Corpus and Voices for Catalan Speech Synthesis
In this paper we describe the design and production of a Catalan database for building synthetic voices. Two speakers each recorded 10 hours of speech. T...
Antonio Bonafonte, Jordi Adell, Ignasi Esquerra, S...
LREC
2008
Glossa: a Multilingual, Multimodal, Configurable User Interface
We describe a web-based corpus query system, Glossa, which combines the expressiveness of regular query languages with the user-friendliness of a graphical interface. Since corpus...
Lars Nygaard, Joel Priestley, Anders Nøkles...
LREC
2008
Annotating an Arabic Learner Corpus for Error
This paper describes an ongoing project in which we are collecting a learner corpus of Arabic, developing a tagset for error annotation and performing Computer-aided Error Analysi...
Ghazi Abuhakema, Reem Faraj, Anna Feldman, Eileen ...
LREC
2008
Creating a Research Collection of Question Answer Sentence Pairs with Amazon's Mechanical Turk
Each year NIST releases a set of (question, document ID, answer) triples for the factoid questions used in the TREC Question Answering track. While this resource is widely used and ...
Michael Kaisser, John Lowe
LREC
2008
An eRulemaking Corpus: Identifying Substantive Issues in Public Comments
We describe the creation of a corpus that supports a real-world hierarchical text categorization task in the domain of electronic rulemaking (eRulemaking). Features of the task an...
Claire Cardie, Cynthia Farina, Matt Rawding, Adil ...
LREC
2008
A Taxonomy of Lexical Metadata Categories
Metadata registries comprising sets of categories to be used in data collections exist in many fields. The purpose of a metadata registry is to facilitate data exchange and intero...
Bodil Nistrup Madsen, Hanne Erdman Thomsen
LREC
2008
Unsupervised and Domain Independent Ontology Learning: Combining Heterogeneous Sources of Evidence
Acquiring knowledge from the Web to build domain ontologies has become a common practice in the Ontological Engineering field. The vast amount of freely available information allo...
David Manzano-Macho, Asunción Gómez-...
LREC
2008
The BNC Parsed with RASP4UIMA
We have integrated the RASP system with the UIMA framework (RASP4UIMA) and used this to parse the XML-encoded version of the British National Corpus (BNC). All original annotation...
Øistein E. Andersen, Julien Nioche, Ted Bri...