Systems biology studies of gene and protein interaction networks have found that these networks are composed of `modules' (groups of tightly interconnected nodes). Module identification is an essential step towards understanding the overall network architecture. Here we focus on module identification methods that combine a node dissimilarity measure with a clustering method. More specifically, we introduce a general class of node dissimilarity measures based on the notion of `topological' overlap, which has been found to be biologically meaningful in several applications. The resulting generalized topological overlap measure (GTOM) generalizes the standard topological overlap measure (TOM) introduced by Ravasz et al. (2002). Specifically, the m-th order version of this family is constructed by (i) counting the number of m-step neighbors that a pair of nodes share and (ii) normalizing it to take a value between 0 and
Andy M. Yip, Steve Horvath
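
The two-step construction described in the abstract (count shared m-step neighbors, then normalize) can be illustrated with a minimal Python sketch. It assumes an unweighted, undirected network given as a symmetric 0/1 adjacency matrix. The abstract does not spell out the normalization, so the denominator used here (the smaller of the two m-step neighborhood sizes, with an adjacency correction in the style of the standard TOM) is an assumption, as are the function names `m_step_neighbors` and `gtom`.

```python
import numpy as np


def m_step_neighbors(adj, m):
    """Boolean matrix R where R[i, j] is True if node j (j != i)
    can be reached from node i in at most m steps."""
    a = np.asarray(adj) != 0
    reach = a.copy()
    for _ in range(m - 1):
        # Extend every neighborhood by one additional step.
        reach = reach | ((reach.astype(int) @ a.astype(int)) > 0)
    np.fill_diagonal(reach, False)  # a node is not its own neighbor
    return reach


def gtom(adj, m=1):
    """Sketch of an m-th order topological overlap matrix.

    Assumed normalization (not stated in the abstract):
        t_ij = (shared_ij + a_ij) / (min(n_i, n_j) + 1 - a_ij)
    where shared_ij counts common m-step neighbors and n_i is the
    size of node i's m-step neighborhood.
    """
    a = (np.asarray(adj) != 0).astype(int)
    np.fill_diagonal(a, 0)
    reach = m_step_neighbors(a, m).astype(int)
    shared = reach @ reach.T          # (i) shared m-step neighbors
    size = reach.sum(axis=1)          # m-step neighborhood sizes
    denom = np.minimum.outer(size, size) + 1 - a
    t = (shared + a) / denom          # (ii) normalization
    np.fill_diagonal(t, 1.0)
    return t


# Toy network: a triangle (nodes 0, 1, 2) with node 3 attached to node 2.
example_adj = np.array([
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
])
overlap_m1 = gtom(example_adj, m=1)
overlap_m2 = gtom(example_adj, m=2)
```

A dissimilarity suitable for a clustering method could then be taken as `1 - overlap_m1`, although that choice is likewise an assumption of this sketch.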