Accurate dependency recovery has recently been reported for a number of wide-coverage statistical parsers using Combinatory Categorial Grammar (CCG). However, overall figures give no indication of a parser's performance on specific constructions, nor how suitable a parser is for specific applications. In this paper we give a detailed evaluation of a CCG parser on object extraction dependencies found in WSJ text. We also show how the parser can be used to parse questions for Question Answering. The accuracy of the original parser on questions is very poor, and we propose a novel technique for porting the parser to a new domain, by creating new labelled data at the lexical category level only. Using a supertagger to assign categories to words, trained on the new data, leads to a dramatic increase in question parsing accuracy.
Stephen Clark, Mark Steedman, James R. Curran