Locating and parsing bibliographic references in HTML medical articles

The set of references that typically appear toward the end of journal articles is sometimes, though not always, a field in bibliographic (citation) databases. But even if references do not constitute such a field, they can be useful as a preprocessing step in the automated extraction of other bibliographic data from articles, as well as in computer-assisted indexing of articles. Automation in data extraction and indexing to minimize human labor is key to the affordable creation and maintenance of large bibliographic databases. Extracting the components of references, such as author names, article title, journal name, publication date and other entities, is therefore a valuable and sometimes necessary task. This paper describes a two-step process using statistical machine learning algorithms to first locate the references in HTML medical articles and then to parse them. Reference locating identifies the reference section in an article and then decomposes it into individual referenc...
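The two steps named in the abstract (locate, then parse) can be illustrated with a minimal sketch. It is not the authors' method: the paper uses statistical machine learning, whereas this sketch swaps in fixed heuristics (a heading match and a regular expression) purely to show the shape of the pipeline. The names `ReferenceLocator`, `locate_references`, `parse_reference`, and `SAMPLE_HTML` are hypothetical, and the sample references are invented for illustration.

```python
import re
from html.parser import HTMLParser

# Invented sample article; the two references are made up for illustration.
SAMPLE_HTML = """
<html><body>
<h2>Introduction</h2><p>Text.</p><ul><li>Not a reference</li></ul>
<h2>References</h2>
<ol>
<li>Smith J, Doe A. An example title. J Example. 2001;1(2):3-4.</li>
<li>Roe B. Another <i>example</i> title. Ex Med. 1999;5:6-7.</li>
</ol>
</body></html>
"""


class ReferenceLocator(HTMLParser):
    """Step 1 (heuristic stand-in): collect <li> items under a
    'References' or 'Bibliography' heading."""

    HEADINGS = {"h1", "h2", "h3", "h4"}

    def __init__(self):
        super().__init__()
        self.in_heading = False
        self.in_section = False  # True while inside the reference section
        self.in_item = False
        self.heading_text = []
        self.item_text = []
        self.references = []

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADINGS:
            self.in_heading = True
            self.heading_text = []
        elif tag == "li" and self.in_section:
            self.in_item = True
            self.item_text = []

    def handle_endtag(self, tag):
        if tag in self.HEADINGS and self.in_heading:
            self.in_heading = False
            title = "".join(self.heading_text).strip().lower()
            # Any other heading closes the reference section.
            self.in_section = title in {"references", "bibliography"}
        elif tag == "li" and self.in_item:
            self.in_item = False
            # Join text fragments and collapse runs of whitespace.
            text = " ".join("".join(self.item_text).split())
            if text:
                self.references.append(text)

    def handle_data(self, data):
        if self.in_heading:
            self.heading_text.append(data)
        elif self.in_item:
            self.item_text.append(data)


def locate_references(html):
    """Return the individual reference strings found in an HTML article."""
    parser = ReferenceLocator()
    parser.feed(html)
    parser.close()
    return parser.references


def parse_reference(text):
    """Step 2 (heuristic stand-in): assume 'Authors. Title. Rest' and
    take the first four-digit year starting with 19 or 20."""
    parts = [p.strip() for p in text.split(". ", 2)]
    year = re.search(r"\b(19|20)\d{2}\b", text)
    return {
        "authors": parts[0] if len(parts) > 0 else "",
        "title": parts[1] if len(parts) > 1 else "",
        "year": year.group(0) if year else None,
    }


if __name__ == "__main__":
    for ref in locate_references(SAMPLE_HTML):
        print(parse_reference(ref))
```

Fixed rules like these break as soon as a journal formats its headings or citations differently, which is presumably why a learned approach is attractive for real articles.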

Jie Zou, Daniel X. Le, George R. Thoma

Algorithms | Bibliographic | IJDAR 2010 | Large Bibliographic Databases

Post Info

Added	27 Jan 2011
Updated	27 Jan 2011
Type	Journal
Year	2010
Where	IJDAR
Authors	Jie Zou, Daniel X. Le, George R. Thoma

Sciweavers
