We propose a novel method, based on concepts from expander graphs, to sample communities in networks. We show that our sampling method, unlike previous techniques, produces subgraphs representative of community structure in the original network. These generated subgraphs may be viewed as stratified samples in that they consist of members from most or all communities in the network. Using samples produced by our method, we show that the problem of community detection may be recast into a case of statistical relational learning. We empirically evaluate our approach against several real-world datasets and demonstrate that our sampling method can effectively be used to infer and approximate community affiliation in the larger network.

Categories and Subject Descriptors: H.2.8 [Information Systems]: Database Applications - Data Mining

General Terms: Algorithms; Experimentation

Keywords: sampling, social network analysis, community detection, complex networks, graphs, clustering
Arun S. Maiya, Tanya Y. Berger-Wolf