Large-scale projects are currently underway to perform whole-genome disease association studies. Such studies involve genotyping hundreds of thousands of SNP markers. One of the main obstacles in performing them is that underlying population substructure can spuriously inflate association signals, yielding artificially small p-values and many false positives. Although existing tools cope well with very distinct subpopulations, closely related population groups remain a major cause of concern. In this work, we present a graph-based approach to detect population substructure. Our method is based on a distance measure between individuals. We show analytically that when the allele frequency differences between two populations are large enough (in the l2-norm sense), our algorithm is guaranteed to classify individuals into their correct subpopulations. We demonstrate the empirical performance of our algorithm on simulated and real data and compare it against existing methods, ...