Existing data mining techniques mostly focus on finding global patterns and lack the ability to systematically discover regional patterns. Most relationships in spatial datasets are regional; therefore there is a great need to extract regional knowledge from spatial datasets. This paper proposes a novel framework to discover interesting regions characterized by “strong regional correlation relationships” between attributes, and methods to analyze differences and similarities between regions. The framework employs a twophase approach: it first discovers regions by employing clustering algorithms that maximize a PCA-based fitness function and then applies post processing techniques to explain underlying regional structures and correlation patterns. Additionally, a new similarity measure that assesses the structural similarity of regions based on correlation sets is introduced. We evaluate our framework in a case study which centers on finding correlations between arsenic pollution an...
Oner Ulvi Celepcikay, Christoph F. Eick, Carlos Or