

Learning Sentential Paraphrases from Bilingual Parallel Corpora for Text-to-Text Generation

Previous work has shown that high-quality phrasal paraphrases can be extracted from bilingual parallel corpora. However, it is not clear whether bitexts are an appropriate resource for extracting more sophisticated sentential paraphrases, which are more obviously learnable from monolingual parallel corpora. We extend bilingual paraphrase extraction to syntactic paraphrases and demonstrate its ability to learn a variety of general paraphrastic transformations, including passivization, dative shift, and topicalization. We discuss how our model can be adapted to many text generation tasks by augmenting its feature set, development data, and parameter estimation routine. We illustrate this adaptation by using our paraphrase model for the task of sentence compression and achieve results competitive with state-of-the-art compression systems.
Added 20 Dec 2011
Updated 20 Dec 2011
Type Journal
Year 2011
Authors Juri Ganitkevitch, Chris Callison-Burch, Courtney Napoles, Benjamin Van Durme