Background: Biological data have traditionally been stored and made publicly available through a variety of on-line databases, whereas biological knowledge has traditionally been found in the printed literature. With journals now online and providing an increasing amount of open access content, often free of copyright restriction, this distinction between database and literature is blurring. To exploit this opportunity we present the integration of open access literature with the RCSB Protein Data Bank (PDB). Results: BioLit provides an enhanced view of articles with markup of semantic data and links to biological databases, based on the content of the article. For example, words matching to existing biological ontologies are highlighted and database identifiers are linked to their database of origin. Among other functions, it identifies PDB IDs that are mentioned in the open access literature, by parsing the full text for all research articles in PubMed Central (PMC) and exposing the...
Andreas Prlic, Marco A. Martinez, Dimitris Dimitro