In this paper we describe a biography summarization system using sentence classification and ideas from information retrieval. Although the individual techniques are not new, asse...
In this paper, we describe a resource-light system for the automatic morphological analysis and tagging of Russian. We eschew the use of extensive resources (particularly, large a...
A method for automatic plot analysis of narrative texts that uses components of both traditional symbolic analysis of natural language and statistical machine-learning is presente...
We apply statistical machine translation (SMT) tools to generate novel paraphrases of input sentences in the same language. The system is trained on large volumes of sentence pair...
Multidocument extractive summarization relies on the concept of sentence centrality to identify the most important sentences in a document. Centrality is typically defined in term...
Most statistical parsers have used the grammar induction approach, in which a stochastic grammar is induced from a treebank. An alternative approach is to induce a controller for ...
If two translation systems differ differ in performance on a test set, can we trust that this indicates a difference in true system quality? To answer this question, we describe b...
Given a parallel parsed corpus, statistical treeto-tree alignment attempts to match nodes in the syntactic trees for a given sentence in two languages. We train a probabilistic tr...