First- and Second-Order Expectation Semirings with Applications to Minimum-Risk Training on Translation Forests

Many statistical translation models can be regarded as weighted logical deduction. Under this paradigm, we use weights from the expectation semiring (Eisner, 2002) to compute first-order statistics (e.g., the expected hypothesis length or feature counts) over packed forests of translations (lattices or hypergraphs). We then introduce a novel second-order expectation semiring, which computes second-order statistics (e.g., the variance of the hypothesis length or the gradient of entropy). This second-order semiring is essential for many interesting training paradigms, such as minimum risk, deterministic annealing, active learning, and semi-supervised learning, where gradient-descent optimization requires computing the gradient of entropy or risk. We use these semirings in an open-source machine translation toolkit, Joshua, enabling minimum-risk training
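As an illustration of the statistics named in the abstract, the following is a minimal sketch of a second-order expectation semiring applied to a toy lattice. It is not taken from Joshua: the class name `Exp2`, the helpers `edge_weight` and `total_weight`, and the four-edge lattice are hypothetical, chosen only for this example. Each semiring element is a tuple (p, r, s, t); addition is componentwise, and multiplication follows the product rule, so that summing over all paths yields the total weight Z together with the unnormalized first and second moments. The sketch assumes an acyclic lattice whose nodes are integers in topological order; hypergraphs would need the inside algorithm instead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Exp2:
    """Second-order expectation semiring element <p, r, s, t> (hypothetical name)."""
    p: float  # path weight
    r: float  # weight * first quantity
    s: float  # weight * second quantity
    t: float  # weight * product of both quantities

    def __add__(self, o):
        # Semiring "plus": componentwise sum over alternative paths.
        return Exp2(self.p + o.p, self.r + o.r, self.s + o.s, self.t + o.t)

    def __mul__(self, o):
        # Semiring "times": product rule for concatenating path segments.
        return Exp2(
            self.p * o.p,
            self.p * o.r + o.p * self.r,
            self.p * o.s + o.p * self.s,
            self.p * o.t + o.p * self.t + self.r * o.s + self.s * o.r,
        )


ZERO = Exp2(0.0, 0.0, 0.0, 0.0)
ONE = Exp2(1.0, 0.0, 0.0, 0.0)


def edge_weight(p, r, s):
    """Lift an edge with weight p and additive quantities r, s into the semiring."""
    return Exp2(p, p * r, p * s, p * r * s)


def total_weight(edges, source, sink):
    """Sum the semiring weights of all source-to-sink paths.

    Assumes every edge (tail, head, weight) satisfies tail < head,
    i.e. node numbers are already a topological order.
    """
    alpha = {source: ONE}
    for tail, head, w in sorted(edges, key=lambda e: e[0]):
        if tail in alpha:
            alpha[head] = alpha.get(head, ZERO) + alpha[tail] * w
    return alpha.get(sink, ZERO)


# Toy lattice (made up): each edge carries an unnormalized weight and a
# length contribution. Using r = s = length makes t accumulate length**2.
toy_edges = [
    (0, 1, edge_weight(3.0, 1, 1)),
    (0, 1, edge_weight(2.0, 2, 2)),
    (1, 2, edge_weight(1.0, 1, 1)),
    (1, 2, edge_weight(1.0, 3, 3)),
]

total = total_weight(toy_edges, source=0, sink=2)
Z = total.p                              # normalizer over all paths
mean = total.r / Z                       # expected hypothesis length
variance = total.t / Z - mean ** 2       # variance of hypothesis length
print(Z, mean, variance)
```

On this toy lattice the four paths have lengths 2, 4, 3, and 5 with weights 3, 3, 2, and 2, so the computed mean is about 3.4 and the variance about 1.24. Dropping the `s` and `t` components leaves the first-order expectation semiring, which is enough for the expected length alone.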

Zhifei Li, Jason Eisner

EMNLP 2009 | Gradient | Hypothesis Length | Many Statistical Translation | Natural Language Processing

Post Info

Added	17 Feb 2011
Updated	17 Feb 2011
Type	Conference
Year	2009
Where	EMNLP
Authors	Zhifei Li, Jason Eisner
