Many machine learning methods have recently been applied to natural language processing tasks. Among them, the Winnow algorithm has been argued to be particularly suitable for NLP...
In this paper, a new language model, the Multi-Class Composite N-gram, is proposed to avoid a data sparseness problem for spoken language in that it is difficult to collect traini...
We present a syntax-based statistical translation model. Our model transforms a source-language parse tree into a target-language string by applying stochastic operations at each ...
We describe a set of supervised machine learning experiments centering on the construction of statistical models of WH-questions. These models, which are built from shallow lingui...
This paper describes the application of the PARADISE evaluation framework to the corpus of 662 human-computer dialogues collected in the June 2000 Darpa Communicator data collecti...
Marilyn A. Walker, Rebecca J. Passonneau, Julie E....
We propose a statistical method that finds the maximum-probability segmentation of a given text. This method does not require training data because it estimates probabilities from...
One of the central issues for information extraction (IE) systems is the cost of customization from one scenario to another. Research on the automated acquisition of patterns is i...
The standard pipeline approach to semantic processing, in which sentences are morphologically and syntactically resolved to a single tree before they are interpreted, is a poor fi...
In a headed tree, each terminal word can be uniquely labeled with a governing word and grammatical relation. This labeling is a summary of a syntactic analysis which eliminates de...
We describe a biographical multidocument summarizer that summarizes information about people described in the news. The summarizer uses corpus statistics along with linguistic kno...
Barry Schiffman, Inderjeet Mani, Kristian J. Conce...