We report here on a study of interannotator agreement in the coreference task as defined by the Message Understanding Conference (MUC-6 and MUC-7). Based on feedback from annotators, we clarified and simplified the annotation specification. We then performed an analysis of disagreement among several annotators, concluding that only 16% of the disagreements represented genuine disagreement about coreference; the remainder of the cases were mostly typographical errors or omissions, easily reconciled. Initially, we measured interannotator agreement in the low 80’s for precision and recall. To try to improve upon this, we ran several experiments. In our final experiment, we separated the tagging of candidate noun phrases from the linking of actual coreferring expressions. This method shows promise -- interannotator agreement climbed to the low 90s -- but it needs more extensive validation. These results position the research community to broaden the coreference task to multiple languages, and possibly to different kinds of coreference...
Lynette Hirschman, Patricia Robinson, John D. Burg