Based on the correlation between expression and ontology-driven gene similarity, we incorporate functional annotations into the validation of gene expression clustering. We first establish a new term-term distance measure based on graph theory, and then propose a probabilistic framework that accommodates incomplete annotations. Comprehensive evaluations are performed on six clustering algorithms. This study is the first to explore a robust quantitative functional relationship between clusters of genes. The resulting validation indices assess clustering quality in terms of the consistency of annotation information and serve as new tools for combining biological knowledge with experimental data.