Huge amounts of data are available in large-scale networks of autonomous data sources dispersed over a wide area. Data mining is an essential technology for extracting hidden and valuable knowledge from these networked data sources. In this paper, we investigate clustering, one of the most important data mining tasks, in one such networked computing environment, namely peer-to-peer (P2P) systems. The lack of central control and the sheer size of P2P systems make existing clustering techniques inapplicable. We propose a fully distributed clustering algorithm, called Peer dENsity-based cluStering (PENS), which overcomes the main challenge of clustering in P2P environments, namely cluster assembly. The main idea of PENS is hierarchical cluster assembly, which enables peers to collaborate in forming a global clustering model without requiring central control or message flooding. The complexity analysis of the algorithm demonstrates that PENS ca...