Sciweavers

BMCBI
2010

LINNAEUS: A species name identification system for biomedical literature

Background: The task of recognizing and identifying species names in biomedical literature has recently been regarded as critical for a number of applications in text and data mining, including gene name recognition, species-specific document retrieval, and semantic enrichment of biomedical articles.

Results: In this paper we describe an open-source species name recognition and normalization software system, LINNAEUS, and evaluate its performance relative to several automatically generated biomedical corpora, as well as a novel corpus of full-text documents manually annotated for species mentions. LINNAEUS uses a dictionary-based approach (implemented as an efficient deterministic finite-state automaton) to identify species names and a set of heuristics to resolve ambiguous mentions. When compared against our manually annotated corpus, LINNAEUS performs with 94% recall and 97% precision at the mention level, and 98% recall and 90% precision at the document level. Our system successful...
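To illustrate the dictionary-based approach the abstract describes, here is a minimal Python sketch of longest-match species name recognition over a token trie, which behaves like a small deterministic automaton. It is not the LINNAEUS implementation: the names `SPECIES`, `build_trie`, and `find_mentions` are hypothetical, the dictionary is a toy subset mapping names to NCBI Taxonomy identifiers, and the disambiguation heuristics mentioned in the abstract are not modeled.

```python
import re

# Toy dictionary (assumption): lower-cased species term -> NCBI Taxonomy ID.
# A real system would load a far larger dictionary of names and synonyms.
SPECIES = {
    "homo sapiens": 9606,
    "human": 9606,
    "mus musculus": 10090,
    "mouse": 10090,
    "drosophila melanogaster": 7227,
    "escherichia coli": 562,
}

END = "$id"  # marker key for an accepting state; cannot collide with \w+ tokens


def build_trie(dictionary):
    """Build a token-level trie; each path from the root spells one term."""
    root = {}
    for term, taxon_id in dictionary.items():
        node = root
        for token in term.split():
            node = node.setdefault(token, {})
        node[END] = taxon_id
    return root


def find_mentions(text, trie):
    """Return (surface text, start, end, taxon ID) for each longest match."""
    tokens = [(m.group().lower(), m.start(), m.end())
              for m in re.finditer(r"\w+", text)]
    mentions = []
    i = 0
    while i < len(tokens):
        node, best, j = trie, None, i
        # Follow transitions while the next token is a valid edge.
        while j < len(tokens) and tokens[j][0] in node:
            node = node[tokens[j][0]]
            j += 1
            if END in node:
                best = (j, node[END])  # remember the longest accepting state
        if best:
            end_idx, taxon_id = best
            start, end = tokens[i][1], tokens[end_idx - 1][2]
            mentions.append((text[start:end], start, end, taxon_id))
            i = end_idx  # resume after the matched span
        else:
            i += 1
    return mentions
```

Calling `find_mentions("Homo sapiens and Mus musculus genes", build_trie(SPECIES))` yields one mention per species name, each normalized to its taxonomy identifier. Matching on tokens rather than characters keeps the sketch short; a character-level automaton would be needed to handle abbreviations such as "E. coli".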
Martin Gerner, Goran Nenadic, Casey M. Bergman
Added 08 Dec 2010
Updated 08 Dec 2010
Type Journal
Year 2010
Where BMCBI