Recent work has revealed a close connection between certain information-theoretic divergence measures and properties of Mercer kernel feature spaces. Specifically, it has been proposed that an information-theoretic measure may be used as a cost function for clustering in a kernel space, approximated by the spectral properties of the Laplacian matrix. In this paper we extend this result to other kernel matrices. We develop a clustering algorithm based on comparing angles between data points, and demonstrate that the proposed method performs as well as a state-of-the-art spectral clustering method. We point out some drawbacks of spectral clustering related to outliers, and suggest measures to address them.
Robert Jenssen, Deniz Erdogmus, Jose C. Principe