Disease arises from aberrant modulation of biological pathways, so identifying activated gene pathways from gene expression data is an important problem. In this work, we develop a framework for identifying activated pathways that incorporates the cellular location of each gene, obtained from gene ontology databases, in addition to gene expression data. The two sources of information are combined using Laplacian Eigenmaps, which co-embeds them into a low-dimensional manifold. Model-based clustering is then performed on the embedding to identify biologically relevant activated pathways. We illustrate the effectiveness of our manifold embedding approach on the problem of extracting immune system pathways from a macrophage gene expression dataset [11].
Arvind Rao, Alfred O. Hero
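
The pipeline summarized in the abstract (co-embedding with Laplacian Eigenmaps, followed by model-based clustering) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names `gaussian_affinity`, `coembed`, and `fit_diag_gmm`, the convex combination of the two affinity matrices with weight `alpha`, the treatment of cellular location as a categorical label, the diagonal-covariance Gaussian mixture used as the clustering model, and the synthetic toy data.

```python
import numpy as np

def gaussian_affinity(X, sigma):
    """Gaussian (heat-kernel) affinity between rows of X, with zero diagonal."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    return W

def coembed(expr, loc, alpha=0.5, dim=2, sigma=1.5):
    """Laplacian Eigenmaps on a combined expression/location graph.

    ASSUMPTION: the two data sources are merged as a convex combination of
    affinities; the paper may combine them differently.
    """
    W_expr = gaussian_affinity(expr, sigma)
    W_loc = (loc[:, None] == loc[None, :]).astype(float)  # same compartment
    np.fill_diagonal(W_loc, 0.0)
    W = alpha * W_expr + (1.0 - alpha) * W_loc
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}
    L_sym = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L_sym)  # eigenvalues in ascending order
    # Skip the trivial eigenvector; rescale to solve L y = lambda D y.
    return d_inv_sqrt[:, None] * vecs[:, 1:dim + 1]

def fit_diag_gmm(Y, k, n_iter=100, floor=1e-8):
    """EM for a diagonal-covariance Gaussian mixture; returns hard labels."""
    n = Y.shape[0]
    # Deterministic farthest-point initialization of the component means.
    means = [Y[0]]
    for _ in range(1, k):
        d2 = np.min([np.sum((Y - m) ** 2, axis=1) for m in means], axis=0)
        means.append(Y[np.argmax(d2)])
    means = np.array(means)
    var = np.tile(Y.var(axis=0) + floor, (k, 1))
    weights = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: responsibilities computed in the log domain for stability.
        diff2 = (Y[:, None, :] - means[None, :, :]) ** 2
        log_p = (np.log(weights)
                 - 0.5 * np.sum(np.log(2.0 * np.pi * var), axis=1)
                 - 0.5 * np.sum(diff2 / var, axis=2))
        log_p -= log_p.max(axis=1, keepdims=True)
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: update mixing weights, means, and per-dimension variances.
        nk = resp.sum(axis=0) + 1e-12
        weights = nk / n
        means = (resp.T @ Y) / nk[:, None]
        diff2 = (Y[:, None, :] - means[None, :, :]) ** 2
        var = np.einsum("nk,nkd->kd", resp, diff2) / nk[:, None] + floor
    return resp.argmax(axis=1)

# Synthetic toy data (hypothetical): two groups of 20 "genes" that differ in
# both expression profile and cellular location label.
rng = np.random.default_rng(0)
expr = np.vstack([rng.normal(0.0, 0.5, (20, 2)), rng.normal(3.0, 0.5, (20, 2))])
loc = np.array([0] * 20 + [1] * 20)

Y = coembed(expr, loc, alpha=0.5, dim=1)
labels = fit_diag_gmm(Y, k=2)
```

In this toy setting a one-dimensional embedding is enough, because the first nontrivial eigenvector already separates the two groups. On real data, the embedding dimension, the kernel width `sigma`, the weight `alpha`, and the number of mixture components would all need to be chosen, for example with a model selection criterion.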