Generation of syntactically correct and unambiguous names for proteins is a challenging, yet vital task for functional annotation processes. Proteins are often named based on homo...
Johannes Goll, Robert Montgomery, Lauren M. Brinka...
A prerequisite for all higher-level information extraction tasks is the identification of unknown names in text. Today, when large corpora can consist of billions of words, it is of...
KEYnet is a database where gene and protein names are hierarchically structured. Particular care has been devoted to the search and organisation of synonyms. The structuring is ba...
Gene and protein names follow few, if any, true naming conventions and are subject to great variation in different occurrences of the same name. This gives rise to two important p...
We first analyzed protein names using various dictionaries and databases and found five problems with protein names; i.e., the treatment of special characters, the treatment of hom...
This paper proposes a method for identifying protein names in biomedical texts with an emphasis on detecting protein name boundaries. We use a probabilistic model which exploits s...
Text mining techniques have been proposed for extracting protein names and their interactions. First, we have made improvements on existing methods for handling single word protein...
Kiho Hong, Junhyung Park, Jihoon Yang, Sungyong Pa...
The huge volumes of biomedical texts available online drive the increasing need for automated techniques to analyze and extract knowledge from these repositories of information. ...