We propose a new approach to EM learning of PCFGs. We completely separate the process of EM learning from that of parsing, and for the former, we introduce a new EM algorithm called the gra...
We study the problem of topic segmentation of manually transcribed speech in order to facilitate information extraction from dialogs. Our approach is based on a combination of mul...
In machine translation, long sentences are usually assumed to be difficult to treat. The main reason is the syntactic ambiguity, which increases explosively as a sentence becomes lo...
Yoon-Hyung Roh, Young Ae Seo, Ki-Young Lee, Sung-K...
We propose a simple two-level hierarchical probability model for unsupervised word segmentation. By treating words as strings composed of morphemes/phonemes which are themselves c...
This paper describes a text generation system, XExplainer, which can dynamically produce a description of commodities in Korean from a relational database for homeshopping sites. ...
This paper presents a method for incorporating natural language processing into existing text categorization procedures. Three aspects are considered in the investigation: (i) a m...
We describe a simple improvement to n-gram language models where we estimate the distribution over closed-class (function) words separately from the conditional distribution of ope...