Sciweavers

EMNLP
2008

Syntactic Constraints on Paraphrases Extracted from Parallel Corpora

14 years 27 days ago
Syntactic Constraints on Paraphrases Extracted from Parallel Corpora
We improve the quality of paraphrases extracted from parallel corpora by requiring that phrases and their paraphrases be the same syntactic type. This is achieved by parsing the English side of a parallel corpus and altering the phrase extraction algorithm to extract phrase labels alongside bilingual phrase pairs. In order to retain broad coverage of non-constituent phrases, complex syntactic labels are introduced. A manual evaluation indicates a 19% absolute improvement in paraphrase quality over the baseline method.
Chris Callison-Burch
Added 29 Oct 2010
Updated 29 Oct 2010
Type Conference
Year 2008
Where EMNLP
Authors Chris Callison-Burch
Comments (0)