Visually and phonologically similar characters are major contributors to errors in Chinese text. By defining appropriate similarity measures that consider extended Cangji...
We present a simple algorithm for clustering semantic patterns based on distributional similarity and use cluster memberships to guide semi-supervised pattern discovery. We apply ...
This paper describes the development of an opinion summarization system that works on a Bengali news corpus. The system identifies the sentiment information in each docu...
We present a probabilistic generative model for learning semantic parsers from ambiguous supervision. Our approach learns from natural language sentences paired with world states ...
An unsupervised discriminative training procedure is proposed for estimating a language model (LM) for machine translation (MT). An English-to-English synchronous context-free gra...
Zhifei Li, Ziyuan Wang, Sanjeev Khudanpur, Jason E...
Our goal is to propose a description model for the lexicon. We describe a software framework for representing the lexicon and its variations called Proteus. Various examples show ...
We present a general methodology for extracting multi-word expressions (of various types), along with their translations, from small parallel corpora. We automatically align the p...
There exists a well-established and almost unanimously adopted measure of tagger performance, namely, accuracy. Although it is perfectly adequate for small tagsets and typical app...