Sciweavers

BMCBI
2008

Normalizing biomedical terms by minimizing ambiguity and variability

Background: One of the difficulties in mapping biomedical named entities, e.g. genes, proteins, chemicals and diseases, to their concept identifiers stems from the potential variability of the terms. Soft string matching is a possible solution to the problem, but its inherently heavy computational cost discourages its use when the dictionaries are large or when real-time processing is required. A less computationally demanding approach is to normalize the terms using heuristic rules, which enables us to look up a dictionary in constant time regardless of its size. The development of good heuristic rules, however, requires extensive knowledge of the terminology in question and is thus the bottleneck of the normalization approach.

Results: We present a novel framework for discovering a list of normalization rules from a dictionary in a fully automated manner. The rules are discovered in such a way that they minimize the ambiguity and variability of the terms in the dictionary. We eva...
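The constant-time lookup described in the abstract can be illustrated with a minimal sketch. It assumes a small set of hand-written rules (lowercasing, then dropping hyphens and whitespace); these are not the rules discovered by the paper's framework. The names `RULES`, `normalize`, `build_index`, `lookup`, `count_ambiguous`, the toy dictionary, and the `GENE:1`/`GENE:2` identifiers are all hypothetical. The ambiguity count is only a rough proxy and may differ from the measure defined in the paper.

```python
import re
from collections import defaultdict

# Hypothetical hand-written rules, for illustration only.
# The paper discovers its rules automatically from the dictionary.
RULES = [
    (re.compile(r"[-\s]+"), ""),  # drop hyphens and whitespace
]


def normalize(term):
    """Lowercase the term, then apply each rewrite rule in order."""
    term = term.lower()
    for pattern, replacement in RULES:
        term = pattern.sub(replacement, term)
    return term


def build_index(dictionary):
    """Map each normalized term to the set of concept IDs it names."""
    index = defaultdict(set)
    for concept_id, terms in dictionary.items():
        for term in terms:
            index[normalize(term)].add(concept_id)
    return index


def lookup(index, mention):
    """Hash lookup on the normalized mention; cost does not grow with dictionary size."""
    return index.get(normalize(mention), set())


def count_ambiguous(index):
    """Rough ambiguity proxy (an assumption): normalized keys shared by several concepts."""
    return sum(1 for ids in index.values() if len(ids) > 1)


# Toy dictionary with made-up concept identifiers.
toy_dictionary = {
    "GENE:1": ["NF-kappa B", "NF kappaB"],
    "GENE:2": ["IL-2", "interleukin 2"],
}
index = build_index(toy_dictionary)
```

In this sketch, `lookup(index, "nf-kappab")` and `lookup(index, "IL 2")` each resolve to a single concept even though neither spelling appears verbatim in the toy dictionary. Overly aggressive rules would instead merge distinct concepts under one key, which is the trade-off between variability and ambiguity that the abstract refers to.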
Yoshimasa Tsuruoka, John McNaught, Sophia Ananiadou
Added 09 Dec 2010
Updated 09 Dec 2010
Type Journal
Year 2008
Where BMCBI
Authors Yoshimasa Tsuruoka, John McNaught, Sophia Ananiadou