This paper reports on the participation of Lymba Corporation (a spinoff of Language Computer Corporation) in the TREC 2007 Question Answering track. It gives an overview of the PowerAnswer 4 question answering system and discusses the new features added to meet the challenges of this year’s evaluation. Special attention was given to methods for incorporating blogs into the searchable collection; statistical and knowledge-driven methods for improving answer precision; new mechanisms for recognizing named entities, events, and time expressions; and updated pattern-driven approaches to answering definition questions. Lymba’s results in the evaluation are presented at the end of the paper.

1 Innovations for TREC 2007

New to TREC this year were a 175 GB collection of blog entries and an updated 2.5 GB collection of newswire articles. To meet the challenge of extracting answers from blogs, PowerAnswer was prepared to search and process a collection of noisy data. For this reason me...
Dan I. Moldovan, Christine Clark, Moldovan Bowden