Using Parsed Corpora for Structural Disambiguation in the TRAINS Domain

This paper describes a prototype disambiguation module, KANKEI, which was tested on two corpora of the TRAINS project. In ambiguous verb phrases of the form V ... NP PP or V ... NP adverb(s), the two corpora have very different PP and adverb attachment patterns: in the first, the correct attachment is to the VP 88.7% of the time, while in the second, the correct attachment is to the NP 73.5% of the time. KANKEI uses various n-gram patterns of the phrase heads around these ambiguities and assigns each parse tree containing such an ambiguity a score based on a linear combination of the frequencies with which these patterns appear with NP and VP attachments in the TRAINS corpora. Unlike previous statistical disambiguation systems, this technique thus combines evidence from bigrams, trigrams, and the 4-gram around an ambiguous attachment. In the current experiments, equal weights are used for simplicity, but results are still good on the TRAINS corpora (92.2% and 92.4% accuracy). Despite the large st...
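
The scoring idea in the abstract (count how often each n-gram of phrase heads occurs with NP versus VP attachment, then combine those frequencies linearly with equal weights) can be illustrated with a minimal Python sketch. Everything specific here is an assumption for illustration and is not taken from the paper: the four head slots, the choice to use every 2-, 3-, and 4-gram over them, the relative-frequency estimate, the VP tie-break default, the class and function names, and the toy training examples.

```python
from collections import Counter
from itertools import combinations

# Hypothetical head slots for "V ... NP PP": verb, noun, preposition, PP object.
SLOTS = ("verb", "noun", "prep", "obj")


def patterns(heads):
    """Yield every 2-, 3-, and 4-gram over the four heads (assumed pattern set)."""
    for size in (2, 3, 4):
        for idx in combinations(range(len(SLOTS)), size):
            # Tag each word with its slot so ("verb", "to") != ("prep", "to").
            yield tuple((SLOTS[i], heads[i]) for i in idx)


class AttachmentScorer:
    """Toy scorer: equal-weight sum of per-pattern attachment frequencies."""

    def __init__(self):
        self.counts = {"NP": Counter(), "VP": Counter()}

    def train(self, examples):
        # examples: iterable of ((verb, noun, prep, obj), "NP" or "VP")
        for heads, attachment in examples:
            for p in patterns(heads):
                self.counts[attachment][p] += 1

    def score(self, heads, attachment, weights=None):
        # weights maps n-gram size -> weight; None means equal weights of 1.0.
        total = 0.0
        for p in patterns(heads):
            seen = self.counts["NP"][p] + self.counts["VP"][p]
            if seen == 0:
                continue  # pattern never observed: contributes no evidence
            w = 1.0 if weights is None else weights.get(len(p), 1.0)
            total += w * self.counts[attachment][p] / seen
        return total

    def predict(self, heads, default="VP"):
        np_score = self.score(heads, "NP")
        vp_score = self.score(heads, "VP")
        if np_score == vp_score:
            return default  # assumed tie-break, including the no-evidence case
        return "NP" if np_score > vp_score else "VP"


# Made-up training examples in the spirit of the TRAINS domain.
TOY_EXAMPLES = [
    (("move", "engine", "to", "Avon"), "VP"),
    (("take", "boxcar", "to", "Corning"), "VP"),
    (("load", "boxcar", "of", "oranges"), "NP"),
    (("take", "tanker", "of", "juice"), "NP"),
]
```

Training on `TOY_EXAMPLES` and calling `predict(("send", "engine", "to", "Bath"))` relies only on the previously seen noun and preposition pair, since no larger pattern matches. Passing a `weights` dictionary keyed by n-gram size is one way unequal weights could be explored in place of the equal weights the abstract mentions.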

Mark G. Core

ACL 1996 | ACL 2007 | Correct Attachment | NP PP | TRAINS Corpora |

Post Info

Added	02 Nov 2010
Updated	02 Nov 2010
Type	Conference
Year	1996
Where	ACL
Authors	Mark G. Core
