Sciweavers

NAACL
2007
13 years 10 months ago
Advances in the CMU/InterACT Arabic GALE Transcription System
This paper describes the CMU/InterACT effort in developing an Arabic Automatic Speech Recognition (ASR) system for broadcast news and conversations within the GALE 2006 evaluation...
Mohamed Noamany, Thomas Schaaf, Tanja Schultz
NAACL
2007
13 years 10 months ago
K-Best Suffix Arrays
Kenneth Ward Church, Bo Thiesson, Robert Ragno
NAACL
2007
13 years 10 months ago
Combination of Statistical Word Alignments Based on Multiple Preprocessing Schemes
We present an approach to using multiple preprocessing schemes to improve statistical word alignments. We show a relative reduction of alignment error rate of about 38%.
Jakob Elming, Nizar Habash
NAACL
2007
13 years 10 months ago
An Integrated Architecture for Speech-Input Multi-Target Machine Translation
The aim of this work is to show the ability of finite-state transducers to simultaneously translate speech into multiple languages. Our proposal deals with an extension of stocha...
Alicia Pérez, Maria-Teresa González,...
NAACL
2007
13 years 10 months ago
Comparing Wikipedia and German Wordnet by Evaluating Semantic Relatedness on Multiple Datasets
We evaluate semantic relatedness measures on different German datasets showing that their performance depends on: (i) the definition of relatedness that was underlying the constr...
Torsten Zesch, Iryna Gurevych, Max Mühlhä...
NAACL
2007
13 years 10 months ago
Tagging Icelandic Text using a Linguistic and a Statistical Tagger
We describe our linguistic rule-based tagger IceTagger, and compare its tagging accuracy to the TnT tagger, a state-of-the-art statistical tagger, when tagging Icelandic, a morphol...
Hrafn Loftsson
NAACL
2007
13 years 10 months ago
A High Accuracy Method for Semi-Supervised Information Extraction
Customization to specific domains of discourse and/or user requirements is one of the greatest challenges for today’s Information Extraction (IE) systems. While demonstrably eff...
Stephen Tratz, Antonio Sanfilippo
NAACL
2007
13 years 10 months ago
Detection of Non-Native Sentences Using Machine-Translated Training Data
Training statistical models to detect non-native sentences requires a large corpus of non-native writing samples, which is often not readily available. This paper examines the exte...
John Lee, Ming Zhou, Xiaohua Liu
NAACL
2007
13 years 10 months ago
Are Very Large N-Best Lists Useful for SMT?
This paper describes an efficient method to extract large n-best lists from a word graph produced by a statistical machine translation system. The extraction is based on the k sh...
Sasa Hasan, Richard Zens, Hermann Ney