Sciweavers

LREC
2008
CallSurf: Automatic Transcription, Indexing and Structuration of Call Center Conversational Speech for Knowledge Extraction and
Being the client's first interface, call centres worldwide contain a huge amount of information of all kinds in the form of conversational speech. If accessible, this infor...
Martine Garnier-Rizet, Gilles Adda, Frederik Caill...
Extended Named Entity Ontology with Attribute Information
Named Entities (NE) are regarded as an important type of semantic knowledge in many natural language processing (NLP) applications. Originally, a limited number of NE categories w...
Satoshi Sekine
Creating Sentence-Aligned Parallel Text Corpora from a Large Archive of Potential Parallel Text using BITS and Champollion
Parallel text is one of the most valuable resources for development of statistical machine translation systems and other NLP applications. The Linguistic Data Consortium (LDC) has...
Kazuaki Maeda, Xiaoyi Ma, Stephanie Strassel
Evaluating Complement-Modifier Distinctions in a Semantically Annotated Corpus
We evaluate the extent to which the distinction between semantically core and non-core dependents as used in the FrameNet corpus corresponds to the traditional distinction between...
Mark McConville, Myroslava Dzikovska
The PIT Corpus of German Multi-Party Dialogues
The PIT corpus is a German multi-media corpus of multi-party dialogues recorded in a Wizard-of-Oz environment at the University of Ulm. The scenario involves two human dialogue pa...
Petra-Maria Strauß, Holger Hoffmann, Wolfgan...
Selection of Japanese-English Equivalents by Integrating High-quality Corpora and Huge Amounts of Web Data
As a first step to developing systems that enable non-native speakers to output near-perfect English sentences for given mixed English-Japanese sentences, we propose new approaches...
Qing Ma, Koichi Nakao, Masaki Murata, Hitoshi Isah...
Investigating the Structure of Procedural Texts for Answering How-to Questions
This paper presents ongoing work dedicated to parsing the textual structure of procedural texts. We propose here a model for the instructional structure and criteria to identify it...
Estelle Delpech, Patrick Saint-Dizier
A Grid of Regional Language Archives
About two years ago, the Max Planck Institute for Psycholinguistics in Nijmegen, The Netherlands, started an initiative to install regional language archives in various places aro...
Paul Trilsbeek, Daan Broeder, Tobias Valkenhoef, P...
Certification and Cleaning up of a Text Corpus: Towards an Evaluation of the "Grammatical" Quality of a Corpus
We present in this article the methods we used for obtaining measures to ensure the quality and well-formedness of a text corpus. These measures allow us to determine the compatib...
Cyril Grouin