Abstract-- The kernel trick and the kernels-as-features idea each yield a nonlinear feature space in which linear feature extraction algorithms can be applied to extract nonlinear features. In this paper, we study the relationship between these two kernel ideas as applied to feature extraction algorithms such as LDA, PCA, and CCA. We provide a rigorous theoretical analysis and show that the two approaches are equivalent up to a different scaling on each feature. These results provide a better understanding of the kernel method.
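
As an informal numerical illustration of the PCA case, the following sketch compares the two ideas on the training set. It is not the construction used in the paper. It assumes that "kernels as features" means taking row i of the doubly centered Gram matrix as the explicit feature vector of sample i. The Gaussian kernel, the value of `gamma`, the random data, and all function names are hypothetical choices made for this example only.

```python
import numpy as np

def rbf_kernel(X, gamma=0.1):
    # Pairwise squared Euclidean distances, then the Gaussian kernel.
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def center_kernel(K):
    # Double centering: Kc = H K H with H = I - (1/n) 1 1^T.
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def kernel_trick_pca(Kc, k):
    # Kernel trick: eigendecompose Kc; the training scores on
    # component j are sqrt(lambda_j) * u_j.
    lam, U = np.linalg.eigh(Kc)
    order = np.argsort(lam)[::-1][:k]
    lam, U = lam[order], U[:, order]
    return U * np.sqrt(lam), lam

def kernels_as_features_pca(Kc, k):
    # Kernels as features (assumed form): row i of Kc is the feature
    # vector of sample i; run ordinary linear PCA via the SVD.
    F = Kc - Kc.mean(axis=0)  # column centering; nearly a no-op for Kc
    P, s, _ = np.linalg.svd(F, full_matrices=False)
    return P[:, :k] * s[:k]

def per_feature_scale(a, b):
    # Least-squares scalar c_j such that b[:, j] ~ c_j * a[:, j].
    return np.sum(a * b, axis=0) / np.sum(a * a, axis=0)

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 4))          # synthetic data, illustration only
Kc = center_kernel(rbf_kernel(X))

Z_trick, lam = kernel_trick_pca(Kc, k=3)
Z_feat = kernels_as_features_pca(Kc, k=3)
c = per_feature_scale(Z_trick, Z_feat)

# Each extracted feature should agree up to its own scalar factor.
print(np.allclose(Z_feat, Z_trick * c))
```

Under these assumptions the two sets of extracted features differ only by one scalar per feature, whose magnitude is the square root of the corresponding eigenvalue of the centered Gram matrix; the sign is arbitrary because eigenvectors are defined only up to sign.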