LREC
2008
Towards the National Corpus of Polish
This paper presents a new corpus project, aiming at building a national corpus of Polish. What makes it different from a typical YACP (Yet Another Corpus Project) is 1) the fact t...
Adam Przepiórkowski, Rafal L. Górski...
LREC
2008
Some Fine Points of Hybrid Natural Language Parsing
Large-scale grammar-based parsing systems nowadays increasingly rely on independently developed, more specialized components for pre-processing their input. However, different too...
Peter Adolphs, Stephan Oepen, Ulrich Callmeier, Be...
LREC
2008
Cross-Corpus Evaluation of Word Alignment
We present the procedures we implemented to carry out system-oriented evaluation of a syntax-based word aligner, ALIBI. We take the approach of regarding cross-corpus evaluation ...
Sylwia Ozdowska
LREC
2008
Lexical Ontology Extraction using Terminology Analysis: Automating Video Annotation
The majority of work described in this paper was conducted as part of the Recovering Evidence from Video by fusing Video Evidence Thesaurus and Video MetaData (REVEAL) project, sp...
Neil Newbold, Bogdan Vrusias, Lee Gillam
LREC
2008
Text Independent Speaker Identification in Multilingual Environments
Speaker identification and verification systems perform poorly when the model is trained in one language and tested in another. This situation is not unu...
Iker Luengo, Eva Navas, Iñaki Sainz, Ibon S...
LREC
2008
Automatic Rewriting of Patient Record Narratives
Patients require access to Electronic Patient Records; however, medical language is often too difficult for patients to understand. Explaining records to patients is a time consumi...
Catalina Hallett, David Hardcastle
LREC
2008
Diacritic Annotation in the Arabic Treebank and its Impact on Parser Evaluation
The Arabic Treebank (ATB), released by the Linguistic Data Consortium, contains multiple annotation files for each source file, due in part to the role of diacritic inclusion in t...
Mohamed Maamouri, Seth Kulick, Ann Bies
LREC
2008
Identification of Naturally Occurring Numerical Expressions in Arabic
In this paper, we define the task of Number Identification in natural context. We present and validate a language-independent semiautomatic approach to quickly building a gold sta...
Nizar Habash, Ryan Roth
LREC
2008
COLDIC, a Lexicographic Platform for LMF compliant lexica
Despite the importance of lexical resources for a number of NLP applications (Machine Translation, Information Extraction, Event Detection and Tracking, Question Answering, amo...
Núria Bel, Sergio Espeja, Montserrat Marimo...
LREC
2008
A Three-stage Disfluency Classifier for Multi Party Dialogues
We present work on a three-stage system to detect and classify disfluencies in multi-party dialogues. The system consists of a regular-expression-based module and two machine lear...
Margot Mieskes, Michael Strube