Supervised and unsupervised PCFG adaptation to novel domains

This paper investigates adapting a lexicalized probabilistic context-free grammar (PCFG) to a novel domain using maximum a posteriori (MAP) estimation. The MAP framework is general enough to include some previous model adaptation approaches, such as the corpus mixing of Gildea (2001). Other approaches that fall within this framework are more effective. In contrast to the results in Gildea (2001), we show F-measure parsing accuracy gains of as much as 2.5% for high-accuracy lexicalized parsing through the use of out-of-domain treebanks, with the largest gains when the amount of in-domain data is small. MAP adaptation can be based on either supervised or unsupervised adaptation data. Even when no in-domain treebank is available, unsupervised techniques provide a substantial accuracy gain over unadapted grammars, with an F-measure improvement of nearly 5%.
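
The relationship between MAP estimation and corpus mixing can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the form of the prior, so the sketch assumes a Dirichlet-style prior centered on the out-of-domain rule probabilities, with a hypothetical weight `tau`. The function name `map_adapt` and the toy rule counts are likewise invented for illustration. Under that assumption, setting `tau` to the total out-of-domain count for a left-hand side gives the same estimate as pooling the two treebanks.

```python
def map_adapt(in_counts, out_probs, tau):
    """MAP-style estimate of rule probabilities for one left-hand side.

    in_counts: in-domain rule counts, e.g. {"NP -> DT NN": 3}
    out_probs: out-of-domain rule probabilities (assumed prior mean)
    tau:       prior weight (hypothetical parameter, chosen by the user)
    """
    total_in = sum(in_counts.values())
    denom = total_in + tau
    if denom == 0:
        raise ValueError("no in-domain counts and zero prior weight")
    rules = set(in_counts) | set(out_probs)
    return {
        r: (in_counts.get(r, 0) + tau * out_probs.get(r, 0.0)) / denom
        for r in rules
    }


# Toy counts for rules expanding NP (invented for illustration).
in_counts = {"NP -> DT NN": 3, "NP -> NNP": 1}
out_counts = {"NP -> DT NN": 2, "NP -> NNP": 1, "NP -> PRP": 1}

out_total = sum(out_counts.values())
out_probs = {r: c / out_total for r, c in out_counts.items()}

# With tau equal to the out-of-domain total, the estimate matches
# simple corpus mixing (pooling the counts from both treebanks).
adapted = map_adapt(in_counts, out_probs, tau=out_total)

pooled_total = sum(in_counts.values()) + out_total
pooled = {
    r: (in_counts.get(r, 0) + out_counts.get(r, 0)) / pooled_total
    for r in set(in_counts) | set(out_counts)
}
```

Other values of `tau` weight the out-of-domain grammar more or less heavily than plain pooling would, which is one way to read the abstract's remark that other approaches within the framework can be more effective.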

Brian Roark, Michiel Bacchiani

Accuracy Gain | NAACL 2003 | NAACL 2007 | Parsing Accuracy Gains | Unsupervised Adaptation Data |

Post Info

Added	31 Oct 2010
Updated	31 Oct 2010
Type	Conference
Year	2003
Where	NAACL
Authors	Brian Roark, Michiel Bacchiani
