
13 years 11 months ago
Unsupervised Learning of Arabic Stemming Using a Parallel Corpus
This paper presents an unsupervised learning approach to building a non-English (Arabic) stemmer. The stemming model is based on statistical machine translation and it uses an Eng...
Monica Rogati, J. Scott McCarley, Yiming Yang
Evaluation Challenges in Large-Scale Document Summarization
We present a large-scale meta evaluation of eight evaluation measures for both single-document and multi-document summarizers. To this end we built a corpus consisting of (a) 100 ...
Dragomir R. Radev, Simone Teufel, Horacio Saggion,...
A Tabulation-Based Parsing Method that Reduces Copying
This paper presents a new bottom-up chart parsing algorithm for Prolog along with a compilation procedure that reduces the amount of copying at run-time to a constant number (2) p...
Gerald Penn, Cosmin Munteanu
Text Chunking by Combining Hand-Crafted Rules and Memory-Based Learning
This paper proposes a hybrid of handcrafted rules and a machine learning method for chunking Korean. In partially free word-order languages such as Korean and Japanese, a smal...
Seong-Bae Park, Byoung-Tak Zhang
Constructing Semantic Space Models from Parsed Corpora
Traditional vector-based models use word co-occurrence counts from large corpora to represent lexical meaning. In this paper we present a novel approach for constructing semantic ...
Sebastian Padó, Mirella Lapata
Towards a Model of Face-to-Face Grounding
We investigate the verbal and nonverbal means for grounding, and propose a design for embodied conversational agents that relies on both kinds of signals to establish common groun...
Yukiko I. Nakano, Gabe Reinstein, Tom Stocky, Just...
Minimum Error Rate Training in Statistical Machine Translation
Often, the training procedure for statistical machine translation models is based on maximum likelihood or related criteria. A general problem of this approach is that there is on...
Franz Josef Och
Syntactic Features and Word Similarity for Supervised Metonymy Resolution
We present a supervised machine learning algorithm for metonymy resolution, which exploits the similarity between examples of conventional metonymy. We show that syntactic head-mo...
Malvina Nissim, Katja Markert
Exploiting Parallel Texts for Word Sense Disambiguation: An Empirical Study
A central problem of word sense disambiguation (WSD) is the lack of manually sense-tagged data required for supervised learning. In this paper, we evaluate an approach to automati...
Hwee Tou Ng, Bin Wang, Yee Seng Chan