The Gene Ontology (GO) is a controlled vocabulary of terms that describe protein functions. It also includes a hierarchical description of the relationships among the terms in the form of a directed acyclic graph (DAG). Several systems have been developed that employ pattern recognition to assign gene function, using a variety of features, including sequence similarity, the presence of protein functional domains and gene expression patterns, but most of these approaches have not considered the hierarchical structure of the GO. Because the DAG represents the functional relationships between the GO terms, it should be an important component of an automated annotation system. We propose a Bayesian, multi-label classifier that incorporates the relationships among GO terms found in the GO DAG. A comparison of our method with previously described annotation systems shows that it provides improved annotation accuracy when the performance of individual GO terms is compared. Mor...
Jaehee Jung, Michael R. Thon
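
As an illustration of what using the hierarchical structure of the GO can mean in practice, the sketch below closes a set of predicted labels under the ancestor relation of a DAG, so that a protein annotated with a term is also annotated with every more general term above it. This is not the Bayesian classifier proposed here; it is only a minimal example of consulting the DAG. The toy graph, the term names, and the functions `ancestors` and `propagate` are hypothetical and chosen for illustration.

```python
from typing import Dict, Set

# Hypothetical toy DAG (assumption: these are not real GO identifiers).
# Each term maps to the set of its direct parents; a term may have several.
PARENTS: Dict[str, Set[str]] = {
    "root": set(),
    "binding": {"root"},
    "catalytic": {"root"},
    "dna_binding": {"binding"},
    "nuclease": {"catalytic", "dna_binding"},
}


def ancestors(term: str, parents: Dict[str, Set[str]]) -> Set[str]:
    """Return every term reachable from `term` by following parent edges."""
    seen: Set[str] = set()
    stack = [term]
    while stack:
        node = stack.pop()
        for parent in parents.get(node, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def propagate(labels: Set[str], parents: Dict[str, Set[str]]) -> Set[str]:
    """Close a label set under the ancestor relation of the DAG."""
    closed = set(labels)
    for term in labels:
        closed |= ancestors(term, parents)
    return closed


if __name__ == "__main__":
    # A single specific prediction implies all of its more general terms.
    print(sorted(propagate({"nuclease"}, PARENTS)))
```

A flat multi-label classifier treats each term independently and can emit label sets that violate this closure; a hierarchy-aware method can avoid such inconsistencies by construction or by post-processing of this kind.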