We propose a computational model of text reuse tailored for ancient literary texts, available to us often only in small and noisy samples. The model takes into account source alte...
We introduce a simple method to pack words for statistical word alignment. Our goal is to simplify the task of automatic word alignment by packing several consecutive words togeth...
This paper introduces a Maximum Entropy dependency parser based on an efficient kbest Maximum Spanning Tree (MST) algorithm. Although recent work suggests that the edge-factored ...
Topic segmentation and identification are often tackled as separate problems whereas they are both part of topic analysis. In this article, we study how topic identification can...
This paper proposes a novel approach to the induction of Combinatory Categorial Grammars (CCGs) by their potential affinity with the Genetic Algorithms (GAs). Specifically, CCGs...
This paper describes the latest version of speech-to-speech translation systems developed by the team of NICT-ATR for over twenty years. The system is now ready to be deployed for...
This paper presents an application of PageRank, a random-walk model originally devised for ranking Web search results, to ranking WordNet synsets in terms of how strongly they pos...
We present an approach to query expansion in answer retrieval that uses Statistical Machine Translation (SMT) techniques to bridge the lexical gap between questions and answers. S...
Stefan Riezler, Alexander Vasserman, Ioannis Tsoch...
In this paper we investigate named entity transliteration based on a phonetic scoring method. The phonetic method is computed using phonetic features and carefully designed pseudo...