Translation of proper names is generally recognized as a significant problem in many multi-lingual text and speech processing applications. Even when large bilingual lexicons use...
This paper proposes a chunking strategy to detect unknown words in Chinese word segmentation. First, a raw sentence is pre-segmented into a sequence of word atoms 1 using a maximum...
Lattice graphs are used as underlying data structures in many statistical processing systems, including natural language processing. Lattices compactly represent multiple possible...
Christopher Collins, M. Sheelagh T. Carpendale, Ge...
Abstract. Automatic document summarization has become increasingly important due to the quantity of written material generated worldwide. Generating good quality summaries enables ...
Judith D. Schlesinger, Dianne P. O'Leary, John M. ...
We present a new phrase-based conditional exponential family translation model for statistical machine translation. The model operates on a feature representation in which sentenc...