Serial analysis of gene expression (SAGE) is one of the most powerful tools for global gene expression profiling. It has led to several biological discoveries and biomedical applications, such as the prediction of new gene functions and the identification of biomarkers in human cancer research. Clustering techniques have become fundamental approaches in these applications. This paper reviews relevant clustering techniques specifically designed for this type of data. It emphasises current limitations and opportunities in this area for supporting biologically meaningful data mining and visualisation.

Background

Serial analysis of gene expression (SAGE) [1] is one of the most powerful high-throughput tools available for global gene expression profiling at the mRNA level. It allows quantitative, simultaneous analysis of thousands of transcripts in a cell or tissue under specific biological conditions without requiring prior, complete functional knowledge of the genes to be ...