ProtChew: Automatic Extraction of Protein Names from Biomedical Literature

With the growing volume of biomedical literature, there is a need for automatic information extraction to support biomedical researchers. Because biomedical information databases are incomplete, dictionary-based extraction is not straightforward, and several approaches using contextual rules and machine learning have been proposed. Our work is inspired by these approaches but is novel in that it is fully automatic and does not rely on expert-tagged corpora. The main ideas are 1) unigram tagging of corpora using known protein names, to produce training examples for the protein name extraction classifier, and 2) tight positive and negative examples, with protein-related words as negative examples and protein names/synonyms as positive examples. We present preliminary results on Medline abstracts about gastrin; further work will test the approach on BioCreative benchmark data sets.
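The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the seed lists, the function names (`tokenize`, `label_unigrams`), the context window, and the sample sentence are all assumptions made for illustration, since the abstract does not specify the lexicons or features actually used. The sketch labels single tokens (unigrams) as positive when they match a known protein name and as negative when they match a protein-related word, and records neighbouring tokens as context that a classifier could later train on.

```python
import re

# Hypothetical seed lists; the lexicons used in the paper are not given in the abstract.
KNOWN_PROTEIN_NAMES = {"gastrin", "cholecystokinin", "pepsin"}                # positives
PROTEIN_RELATED_WORDS = {"receptor", "secretion", "expression", "binding"}   # negatives

# Simple word tokenizer that keeps hyphenated tokens such as "IL-2" together.
TOKEN_RE = re.compile(r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*")


def tokenize(text):
    """Split text into word-like tokens, dropping punctuation."""
    return TOKEN_RE.findall(text)


def label_unigrams(text, window=1):
    """Build labeled training examples from unigrams found in either seed list.

    Tokens in neither list are skipped, so the training set contains only
    'tight' positives (protein names) and negatives (protein-related words).
    """
    tokens = tokenize(text)
    examples = []
    for i, token in enumerate(tokens):
        key = token.lower()
        if key in KNOWN_PROTEIN_NAMES:
            label = 1
        elif key in PROTEIN_RELATED_WORDS:
            label = 0
        else:
            continue
        left = tuple(t.lower() for t in tokens[max(0, i - window):i])
        right = tuple(t.lower() for t in tokens[i + 1:i + 1 + window])
        examples.append({"token": token, "left": left, "right": right, "label": label})
    return examples


if __name__ == "__main__":
    # Made-up sentence in the style of a Medline abstract about gastrin.
    sentence = "Gastrin stimulates acid secretion via the CCK2 receptor."
    for example in label_unigrams(sentence):
        print(example)
```

The context tuples are one possible feature choice; a classifier trained on them could then be applied to tokens that appear in neither list, which is where previously unseen protein names would be found.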

Amund Tveit, Rune Sætre, Astrid Lægreid, Tonje Strommen Steigedal

Biomedical | Database | ICDE 2005 | Incomplete Biomedical Information | Negative Examples

» Structured literature image finder: Parsing text and figures in biomedical literature

» Sentence Simplification Aids Protein-Protein Interaction Extraction

» A sentence sliding window approach to extract protein annotations from biomedical articles

» LINNAEUS: A species name identification system for biomedical literature

» Protein Association Discovery in Biomedical Literature

» Generating gene summaries from biomedical literature: A study of semistructured summarizati...

» A method for automatically extracting infectious disease-related primers and probes from th...

» Extracting Protein-Protein Interactions from MEDLINE using the Hidden Vector State model

Post Info

Added	24 Jun 2010
Updated	24 Jun 2010
Type	Conference
Year	2005
Where	ICDE
Authors	Amund Tveit, Rune Sætre, Astrid Lægreid, Tonje Strommen Steigedal
