Sciweavers

LREC
2008
Communicating Unknown Words in Machine Translation
A new approach to handle unknown words in machine translation is presented. The basic idea is to find definitions for the unknown words on the source language side and translate t...
Matthias Eck, Stephan Vogel, Alex Waibel
LREC
2008
New Resources for Document Classification, Analysis and Translation Technologies
The goal of the DARPA MADCAT (Multilingual Automatic Document Classification Analysis and Translation) Program is to automatically convert foreign language text images into Englis...
Stephanie Strassel, Lauren Friedman, Safa Ismael, ...
LREC
2008
Detecting Errors in Semantic Annotation
We develop a method for detecting errors in semantic predicate-argument annotation, based on the variation n-gram error detection method. After establishing an appropriate data re...
Markus Dickinson, Chong Min Lee
LREC
2008
Acquiring a Poor Man's Inflectional Lexicon for German
Many NLP modules and applications require the availability of a module for wide-coverage inflectional analysis. One way to obtain such analyses is to use a morphological analyser...
Peter Adolphs
LREC
2008
Unsupervised Acquisition of Verb Subcategorization Frames from Shallow-Parsed Corpora
In this paper, we report experiments on the unsupervised automatic acquisition of Italian and English verb subcategorization frames (SCFs) from general and domain corpora. The propo...
Alessandro Lenci, Barbara McGillivray, Simonetta M...
LREC
2008
An Approach to Modeling Heterogeneous Resources for Information Extraction
In this paper, we describe an approach that aims to model heterogeneous resources for information extraction. Documents are modeled in a graph representation that enables better under...
Lei Xia, José Iria
LREC
2008
Developing a Phonemic and Syllabic Frequency Inventory for Spontaneous Spoken Castilian Spanish and their Comparison to Text-Bas
In this paper we present our recent work to develop phonemic and syllabic inventories for Castilian Spanish based on the C-ORAL-ROM corpus, a spontaneous spoken Spanish with varyi...
Antonio Moreno-Sandoval, Doroteo Torre Toledano, R...
LREC
2008
Automatic Translation of Biomedical Terms by Supervised Machine Learning
In this paper, we present a simple yet efficient automatic system to translate biomedical terms. It mainly relies on a machine learning approach able to infer rewriting rules from...
Vincent Claveau
LREC
2008
Design and Data Collection for Spoken Polish Dialogs Database
Spoken corpora provide a critical resource for research, development and evaluation of spoken dialog systems. This paper describes the telephone spoken dialog corpus for Polish cr...
Krzysztof Marasek, Ryszard Gubrynowicz
LREC
2008
Construction of a Metadata Database for Efficient Development and Use of Language Resources
The National Institute of Information and Communications Technology (NICT) and Nagoya University have been jointly constructing a large-scale database named SHACHI by collecting d...
Hitomi Tohyama, Shunsuke Kozawa, Kiyotaka Uchimoto...