The current methods of publishing chemical information in bioscience articles are analysed. Using three papers as use cases, it is shown that conventional methods relying on manual procedures, including cut-and-paste, are time-consuming and introduce errors. The meaning of chemical terms and the identity of compounds are often ambiguous. Valuable experimental data such as spectra and computational results are almost always omitted. We describe a proof-of-concept Open XML architecture that addresses these concerns. Compounds are identified through explicit connection tables or links to persistent Open resources such as PubChem. It is argued that if publishers adopt these tools and protocols, the quality and quantity of chemical information available to bioscientists will increase, and authors, publishers and readers will find the process cost-effective.
Peter Murray-Rust, John B. O. Mitchell, Henry S. R