Semi-supervised boosting for classification requires a similarity measure to quantify the distance between samples, both labeled and unlabeled. However, most existing methods employ a simple metric, such as the Euclidean distance, which may not reflect the actual similarity between samples. This paper presents a novel similarity learning method based on the geodesic distance. It incorporates the manifold, margin, and density information of the data, all of which are important in semi-supervised classification. The proposed similarity measure is then applied to a semi-supervised multi-class boosting (SSMB) algorithm. As a result, the three semi-supervised assumptions, namely the smoothness, low-density separation, and manifold assumptions, are all satisfied. We evaluate the proposed method on UCI databases. Experimental results show that the SSMB algorithm with the proposed similarity measure outperforms the SSMB algorithm with the Euclidean distance.
Q. Y. Wang, Pong Chi Yuen, Guo-Can Feng
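
To illustrate why a geodesic distance can differ from the Euclidean distance, the following is a minimal sketch of the generic graph-based approximation: build a k-nearest-neighbor graph and take shortest-path lengths along it. It is not the similarity measure proposed in the paper, which also incorporates margin and density information. The function names (`knn_graph`, `geodesic_distances`, `geodesic_similarity`), the parameters `k` and `sigma`, and the Gaussian kernel used to turn distances into similarities are all assumptions made for illustration.

```python
import heapq
import math


def knn_graph(points, k):
    """Build a symmetric k-nearest-neighbor graph with Euclidean edge weights.

    Hypothetical helper: the graph construction is an assumption,
    not taken from the paper.
    """
    n = len(points)
    graph = {i: {} for i in range(n)}
    for i in range(n):
        # Sort all other points by Euclidean distance to point i.
        neighbors = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        for d, j in neighbors[:k]:
            graph[i][j] = d
            graph[j][i] = d  # keep the graph undirected
    return graph


def geodesic_distances(graph, source):
    """Approximate geodesic distances from `source` with Dijkstra's algorithm."""
    dist = {node: math.inf for node in graph}
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def geodesic_similarity(points, k=2, sigma=1.0):
    """Map graph geodesic distances to similarities in [0, 1].

    The Gaussian kernel is an illustrative choice (assumption).
    Unreachable pairs have infinite distance and similarity 0.
    """
    graph = knn_graph(points, k)
    n = len(points)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        d = geodesic_distances(graph, i)
        for j in range(n):
            sim[i][j] = math.exp(-(d[j] ** 2) / (2.0 * sigma ** 2))
    return sim


# Toy data: eight points along a U-shaped curve (hypothetical example).
points = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (3, 2), (3, 1), (3, 0)]
dist_from_first = geodesic_distances(knn_graph(points, k=2), source=0)
```

On this toy U-shaped set, the two end points are 3 units apart in a straight line, but the path along the curve is 7 units long. A similarity built from the path length therefore treats the two ends as far less alike than the Euclidean distance would suggest, which is the behavior the manifold assumption calls for.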