BMCBI
2008

How to make the most of NE dictionaries in statistical NER

Background: When term ambiguity and variability are very high, dictionary-based Named Entity Recognition (NER) is not an ideal solution, even though large-scale terminological resources are available. Much research on statistical NER has tried to cope with these problems. However, it is not straightforward to exploit existing and additional Named Entity (NE) dictionaries in statistical NER. Presumably, adding NEs to an NE dictionary should lead to better performance; in reality, however, NER models must be retrained to achieve this. We chose protein name recognition as a case study because it suffers most from the problems of heavy term variation and ambiguity. Methods: We have established a novel way to improve NER performance by adding NEs to an NE dictionary without retraining. In our approach, known NEs are first identified in parallel with Part-of-Speech (POS) tagging based on a general word dictionary and an NE dictionary. Then, statistical NER is traine...
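The dictionary lookup step described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the paper identifies known NEs in parallel with POS tagging, whereas this sketch substitutes a plain greedy longest-match lookup. The dictionary entries, the tag names, and the function `tag_tokens` are all hypothetical. The point it shows is that adding an entry to the NE dictionary changes the tags that a downstream statistical model would see, without retraining anything.

```python
from typing import Dict, List, Tuple

# Hypothetical toy dictionaries; entries and tag names are illustrative only.
WORD_DICT: Dict[str, str] = {"the": "DT", "binds": "VBZ", "to": "TO", "receptor": "NN"}
NE_DICT: Dict[Tuple[str, ...], str] = {
    ("il-2",): "PROTEIN",
    ("il-2", "receptor"): "PROTEIN",
}


def tag_tokens(
    tokens: List[str],
    word_dict: Dict[str, str],
    ne_dict: Dict[Tuple[str, ...], str],
    max_len: int = 5,
) -> List[Tuple[str, str]]:
    """Greedy longest-match tagging (an assumption, not the paper's tagger).

    NE dictionary entries are tried first, longest span first; tokens not
    covered by an NE entry fall back to the general word dictionary.
    """
    tags: List[Tuple[str, str]] = []
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate span starting at position i first.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            key = tuple(t.lower() for t in tokens[i:i + n])
            if key in ne_dict:
                label = ne_dict[key]
                tags.append((tokens[i], "B-" + label))
                tags.extend((t, "I-" + label) for t in tokens[i + 1:i + n])
                i += n
                matched = True
                break
        if not matched:
            # Unknown words get a placeholder tag in this sketch.
            tags.append((tokens[i], word_dict.get(tokens[i].lower(), "UNK")))
            i += 1
    return tags


sentence = "The IL-2 receptor binds to NFKB".split()
before = tag_tokens(sentence, WORD_DICT, NE_DICT)

# Extending the NE dictionary changes the tags; no model is retrained here.
extended = dict(NE_DICT)
extended[("nfkb",)] = "PROTEIN"
after = tag_tokens(sentence, WORD_DICT, extended)
```

In a setup like the one the abstract outlines, tags of this kind would presumably serve as input to the statistical NER stage; how that stage consumes them is not stated in the truncated abstract and is left open here.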
Added 09 Dec 2010
Updated 09 Dec 2010
Type Journal
Year 2008
Where BMCBI
Authors Yutaka Sasaki, Yoshimasa Tsuruoka, John McNaught, Sophia Ananiadou