Mean shift is a popular approach to data clustering; however, the high computational complexity of the mean shift procedure limits its practical application to high-dimensional and large data sets. In this paper, we propose an efficient method that allows mean shift clustering to be performed at interactive rates on large data sets containing tens of millions of points. The key to our method is a new scheme for approximating the mean shift procedure using a greatly reduced feature space. This reduced feature space is an adaptive clustering of the original data set, generated by applying an adaptive KD-tree in a high-dimensional affinity space. The proposed method significantly reduces the computational cost while obtaining almost the same clustering results as the standard mean shift procedure. We present several kinds of data clustering applications to illustrate the efficiency of the proposed method, including image and video segmentation, static geometry model and time-varying seq...