Slepian-Wolf coding is a promising distributed source coding technique that can completely remove the data redundancy caused by spatially correlated observations in wireless sensor networks (WSNs). In this paper, we study the major problems in applying Slepian-Wolf coding to data aggregation in cluster-based WSNs, with the objective of optimizing data compression so that the total amount of data in the whole network is minimized. We first consider the clustered Slepian-Wolf coding problem, which aims to select a set of disjoint potential clusters that covers the whole network such that the global compression gain of Slepian-Wolf coding is maximized. To solve this problem, a distributed optimal-compression clustering protocol (DOC2) is proposed. Under the optimal cluster hierarchy constructed by DOC2, we then consider the optimal intra-cluster rate allocation problem and present an approximation algorithm that can find an optimal rate allocation within each cluster to minimize the ...