QuestionBank: Creating a Corpus of Parse-Annotated Questions


This paper describes the development of QuestionBank, a corpus of 4000 parse-annotated questions for (i) use in training parsers employed in QA, and (ii) evaluation of question parsing. We present a series of experiments to investigate the effectiveness of QuestionBank as both an exclusive and a supplementary training resource for a state-of-the-art parser in parsing both question and non-question test sets. We introduce a new method for recovering empty nodes and their antecedents (capturing long-distance dependencies) from parser output in CFG trees using LFG f-structure reentrancies. Our main findings are (i) using QuestionBank training data improves parser performance to 89.75% labelled bracketing f-score, an increase of almost 11% over the baseline; (ii) back-testing experiments on non-question data (Penn-II WSJ Section 23) show that the retrained parser does not suffer a performance drop on non-question material; (iii) ablation experiments show that the size of training material pr...
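The labelled bracketing f-score quoted above is the usual PARSEVAL-style measure: each constituent is treated as a (label, start, end) bracket, and the parser's brackets are compared with the gold brackets. The following is a minimal sketch of that computation, not the authors' evaluation setup. The nested-tuple tree encoding, the function names `labelled_brackets` and `bracketing_f_score`, and the example trees are assumptions made for illustration; the sketch also skips refinements that standard scorers apply, such as punctuation handling.

```python
from collections import Counter


def labelled_brackets(tree, start=0):
    """Collect labelled brackets from a tree given as nested tuples.

    Assumed encoding (hypothetical, for illustration only):
    a phrase is (label, child, child, ...) and a pre-terminal is (tag, word).
    Returns a Counter of (label, start, end) spans and the end position.
    """
    label, *children = tree
    # A pre-terminal covers exactly one token and is not counted as a bracket.
    if len(children) == 1 and isinstance(children[0], str):
        return Counter(), start + 1
    brackets = Counter()
    end = start
    for child in children:
        child_brackets, end = labelled_brackets(child, end)
        brackets += child_brackets
    brackets[(label, start, end)] += 1
    return brackets, end


def bracketing_f_score(gold_tree, test_tree):
    """Return (precision, recall, f_score) over labelled brackets."""
    gold, _ = labelled_brackets(gold_tree)
    test, _ = labelled_brackets(test_tree)
    # Multiset intersection: a bracket matches only if label and span agree.
    matched = sum((gold & test).values())
    precision = matched / sum(test.values()) if test else 0.0
    recall = matched / sum(gold.values()) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Illustrative question: the parser output mislabels the SQ node as S.
gold = ("SBARQ",
        ("WHNP", ("WP", "What")),
        ("SQ", ("VBD", "did"), ("NP", ("NNP", "John")), ("VP", ("VB", "buy"))))
test = ("SBARQ",
        ("WHNP", ("WP", "What")),
        ("S", ("VBD", "did"), ("NP", ("NNP", "John")), ("VP", ("VB", "buy"))))

precision, recall, f_score = bracketing_f_score(gold, test)
print(f"P={precision:.2f} R={recall:.2f} F={f_score:.2f}")
```

In this toy case four of the five brackets match in both directions, so precision, recall, and f-score all come out at about 0.8. A single wrong label on a question-specific node such as SQ already costs a bracket, which is the kind of error that question-specific training data is meant to reduce.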

John Judge, Aoife Cahill, Josef van Genabith


ACL 2006 | ACL 2007 | Data Improves Parser | Long Distance Dependencies | Parser


Post Info

Added	30 Oct 2010
Updated	30 Oct 2010
Type	Conference
Year	2006
Where	ACL
Authors	John Judge, Aoife Cahill, Josef van Genabith
