Hitting the Right Paraphrases in Good Time

Download alchemy.cs.washington.edu

We present a random-walk-based approach to learning paraphrases from bilingual parallel corpora. The corpora are represented as a graph in which each node corresponds to a phrase, and an edge connects two nodes if their phrases are aligned in a phrase table. We sample random walks to compute the average number of steps it takes to reach each phrase from a phrase of interest, which yields a ranking of paraphrases with better ones being "closer" to that phrase. This approach allows "feature" nodes that represent domain knowledge to be built into the graph, and incorporates truncation techniques that keep the graph small enough to process efficiently. Current approaches, by contrast, implicitly presuppose the graph to be bipartite, are limited to finding paraphrases that lie on paths of length two from a phrase, and do not generally permit easy incorporation of domain knowledge. Manual evaluation of generated output shows that our approach outperforms the state-of-the-art system of Callison-Bur...
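
The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes uniform transition probabilities between aligned phrases, a cap on walk length (max_steps) as a simple stand-in for truncation, and a toy phrase table; the function names, parameters, and example phrases are all hypothetical. Each node is a (language, phrase) pair, and candidates are ranked by the estimated average number of steps needed to reach them from the query phrase.

```python
import random
from collections import defaultdict


def build_graph(phrase_table):
    """Build an undirected graph from (phrase_a, phrase_b) alignment pairs.

    Each phrase is a (language, text) tuple. Edges are unweighted here,
    which is a simplifying assumption of this sketch.
    """
    neighbours = defaultdict(set)
    for a, b in phrase_table:
        neighbours[a].add(b)
        neighbours[b].add(a)
    # Sorted lists make sampling reproducible for a fixed seed.
    return {node: sorted(adj) for node, adj in neighbours.items()}


def estimate_hitting_times(graph, query, num_walks=2000, max_steps=6, seed=0):
    """Estimate the average number of steps to first reach each node from query.

    A node that a walk never reaches is charged max_steps for that walk,
    so every estimate lies between 1 and max_steps.
    """
    if query not in graph:
        raise KeyError(f"query phrase not in graph: {query!r}")
    rng = random.Random(seed)
    totals = defaultdict(int)
    for _ in range(num_walks):
        node = query
        first_hit = {}
        for step in range(1, max_steps + 1):
            node = rng.choice(graph[node])  # uniform step to an aligned phrase
            if node != query and node not in first_hit:
                first_hit[node] = step
        for other in graph:
            if other != query:
                totals[other] += first_hit.get(other, max_steps)
    return {node: total / num_walks for node, total in totals.items()}


def rank_paraphrases(graph, query, **kwargs):
    """Return same-language phrases ordered from 'closest' to 'farthest'."""
    times = estimate_hitting_times(graph, query, **kwargs)
    same_language = [n for n in times if n[0] == query[0]]
    return sorted(same_language, key=lambda n: times[n])


# Hypothetical toy phrase table: English phrases linked through German ones.
toy_table = [
    (("en", "under control"), ("de", "unter kontrolle")),
    (("de", "unter kontrolle"), ("en", "in check")),
    (("en", "in check"), ("de", "in schach")),
    (("de", "in schach"), ("en", "at bay")),
]
toy_graph = build_graph(toy_table)
ranking = rank_paraphrases(toy_graph, ("en", "under control"))
```

In this toy graph, "at bay" is four steps from "under control", so a method restricted to paths of length two would not find it, while the random-walk estimate still assigns it a rank.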

Stanley Kok, Chris Brockett

Bilingual Parallel Corpora | Computational Linguistics | Domain Knowledge | NAACL 2010 | Paraphrases

Post Info

Added	14 Feb 2011
Updated	14 Feb 2011
Type	Conference
Year	2010
Where	NAACL
Authors	Stanley Kok, Chris Brockett
