KDD
2003
ACM

Understanding captions in biomedical publications

Download murphylab.web.cmu.edu

From the standpoint of the automated extraction of scientific knowledge, an important but little-studied part of scientific publications are the figures and accompanying captions. Captions are dense in information, but also contain many extra-grammatical constructs, making them awkward to process with standard information extraction methods. We propose a scheme for "understanding" captions in biomedical publications by extracting and classifying "image pointers" (references to the accompanying image). We evaluate a number of automated methods for this task, including hand-coded methods, methods based on existing learning techniques, and methods based on novel learning techniques. The best of these methods leads to a usefully accurate tool for caption-understanding, with both recall and precision in excess of 94% on the most important single class in a combined extraction/classification task.

General Terms: Information extraction

Keywords: Information extraction, bioi...

William W. Cohen, Richard C. Wang, Robert F. Murphy

Related Content

» Probabilistic models for topic learning from images and captions in online biomedical lite...

» Genre-Based Search through Biomedical Images

» Structured literature image finder: Parsing text and figures in biomedical literature

» Figure content analysis for improved biomedical article retrieval

» Biomedical article retrieval using multimodal features and image annotations in regionbase...

» Visualization and Language Processing for Supporting Analysis across the Biomedical Litera...

» A coherent graphbased semantic clustering and summarization approach for biomedical litera...

» PubFocus semantic MEDLINEPubMed citations analytics through integration of controlled biom...

» Biomedical Article Classification Using an Agent-Based Model of T-Cell Cross-Regulation

Post Info

Added	30 Nov 2009
Updated	30 Nov 2009
Type	Conference
Year	2003
Where	KDD
Authors	William W. Cohen, Richard C. Wang, Robert F. Murphy
