Over the past five years, the Defense Advanced Research Projects Agency (DARPA) has funded development of speech translation systems for tactical applications. A key component of ...
Sherri L. Condon, Jon Phillips, Christy Doran, Joh...
In this paper we describe the methodology and the first steps for the creation of WNTERM (from WordNet and Terminology), a specialized lexicon produced from the merger of the Euro...
Eli Pociello, Antton Gurrutxaga, Eneko Agirre, Iza...
In this article we present a method for combining different information retrieval models in order to increase the retrieval performance in a Speech Information Retrieval task. The...
This paper discusses the design, recording and preprocessing of a Czech sign language corpus. The corpus is intended for training and testing of sign language recognition (SLR) sy...
We report on an on-going research project aimed at increasing the range of translation equivalents which can be automatically discovered by MT systems. The methodology is based on...
The NIST Automatic Content Extraction (ACE) Evaluation expands its focus in 2008 to encompass the challenge of cross-document and cross-language global integration and reconciliat...
Stephanie Strassel, Mark A. Przybocki, Kay Peterso...
WCTAnalyze is a tool for storing, accessing and visually analyzing huge collections of temporally indexed data. It is motivated by applications in media analysis, business intelli...
Sebastian Gottwald, Matthias Richter, Gerhard Heye...
This paper proposes an extension of Sumida and Torisawa's method of acquiring hyponymy relations from hierachical layouts in Wikipedia (Sumida and Torisawa, 2008). We extract...
In this paper we present two experiments conducted for comparison of different language identification algorithms. Short words-, frequent words- and n-gram-based approaches are co...
Lena Grothe, Ernesto William De Luca, Andreas N&uu...
The paper describes the project held within Russian National Corpus (http://www.ruscorpora.ru). Beside such obligatory constituents of a linguistic corpus as POS (parts of speech)...