Achieving high classification accuracy is a major challenge in the diagnosis of cancer types based on gene expression profiles. These profiles are notoriously noisy in that a large number of genes might be irrelevant to or weakly associated with disease phenotypes such as tumors. Assigning different weights to genes could decrease or diminish the influences of those "noisy" signals, and thereby improve classification accuracy. We propose an intuitive and simple approach to cancer classification with feature weighting. Our strangeness-based feature weighting method learns weights for different genes based on their classification performance. Those genes with large weights can be used as discriminative genes. We demonstrate that our implementation of kNN classifier achieved high classification accuracy on two benchmark cancer data sets. In the case of relatively low accuracy, the proposed method could be used as a feature filter. With combined feature weighting and AdaBoost, w...
Haifeng Shao, Bei Yu, Joseph H. Nadeau