Study of large and highly stratified population datasets by combining iterative pruning principal component analysis and STRUCTU