Robust talking face video verification using joint factor analysis and sparse representation on GMM mean shifted supervectors