Information Extraction (IE) systems typically rely on extraction patterns encoding domain-specific knowledge. When matched against natural language texts, these patterns recognize with high accuracy information relevant to the extraction task. Adapting an IE system to a new extraction scenario entails devising a new collection of extraction patterns, a time-consuming and expensive process. To overcome this obstacle, we have implemented in CICERO, our IE system, a pattern acquisition mechanism that combines lexico-semantic knowledge available from WordNet with syntactic information collected from training corpora. The open-domain nature of the knowledge encoded in WordNet grants portability of our approach across multiple extraction domains.
Sanda M. Harabagiu, Mihai Surdeanu, Paul Morarescu