Error-Driven Pruning of Treebank Grammars for Base Noun Phrase Identification

Finding simple, non-recursive, base noun phrases is an important subtask for many natural language processing applications. While previous empirical methods for base NP identification have been rather complex, this paper instead proposes a very simple algorithm that is tailored to the relative simplicity of the task. In particular, we present a corpus-based approach for finding base NPs by matching part-of-speech tag sequences. The training phase of the algorithm is based on two successful techniques: first, the base NP grammar is read from a "treebank" corpus; then, the grammar is improved by selecting rules with high "benefit" scores. Using this simple algorithm with a naive heuristic for matching rules, we achieve surprising accuracy in an evaluation on the Penn Treebank Wall Street Journal.
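The two training steps and the matching step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not define the "benefit" score or the matching heuristic, so the sketch assumes that a rule's benefit is its number of correct matches minus its number of incorrect matches on held-out data, and that matching is greedy, left to right, preferring the longest rule. The function names (`extract_rules`, `match`, `prune`), the `threshold` parameter, and the toy tag sequences are all hypothetical.

```python
from collections import Counter

def extract_rules(corpus):
    """Read the grammar: every POS-tag sequence that spans a bracketed base NP."""
    rules = set()
    for tags, spans in corpus:
        for start, end in spans:
            rules.add(tuple(tags[start:end]))
    return rules

def match(tags, rules):
    """Assumed heuristic: greedy left-to-right matching, longest rule first."""
    max_len = max((len(r) for r in rules), default=0)
    found, i = [], 0
    while i < len(tags):
        for n in range(min(max_len, len(tags) - i), 0, -1):
            candidate = tuple(tags[i:i + n])
            if candidate in rules:
                found.append((i, i + n, candidate))
                i += n
                break
        else:
            i += 1  # no rule starts at this position
    return found

def prune(rules, held_out, threshold=1):
    """Keep rules whose assumed benefit (correct - incorrect) reaches threshold."""
    benefit = Counter()
    for tags, spans in held_out:
        gold = set(spans)
        for start, end, rule in match(tags, rules):
            benefit[rule] += 1 if (start, end) in gold else -1
    return {r for r in rules if benefit[r] >= threshold}

# Toy data: (POS tags, gold base-NP spans as [start, end) token offsets).
training = [
    (["DT", "NN", "VBD", "VBG", "NNS"], [(0, 2), (3, 5)]),  # "rising prices" is an NP
    (["NNS", "VBP"], [(0, 1)]),
]
held_out = [
    (["DT", "NN", "VBZ", "VBG", "NNS"], [(0, 2), (4, 5)]),  # "buying" is a verb here
    (["NNS", "VBP"], [(0, 1)]),
]

rules = extract_rules(training)
pruned = prune(rules, held_out)
chunks = match(held_out[0][0], pruned)
```

In this toy run the rule `VBG NNS` produces one incorrect bracket on the held-out sentence, so it is dropped, and the pruned grammar then brackets only `NNS` in that position. Note that with `threshold=1` a rule that never fires on the held-out data is also discarded; whether that is desirable is a design choice of this sketch, not something the abstract states.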

Claire Cardie, David R. Pierce

ACL 1998 | ACL 2007 | Base Np | Base Np Identification | Simple Algorithm |

Post Info

Added	01 Nov 2010
Updated	01 Nov 2010
Type	Conference
Year	1998
Where	ACL
Authors	Claire Cardie, David R. Pierce
