Population-based real-life datasets often contain small clusters of unusual sub-populations. Although these clusters, called `hot spots', are small and sparse, they are usually of special interest to an analyst. In this paper we introduce a visual drill-down approach based on the Self-Organizing Map (SOM) to explore the characteristics of such hot spots in real-life datasets. Iterative clustering algorithms (such as k-means) and the SOM are not designed to show these small and sparse clusters in detail. The feasibility of our approach is demonstrated using a large real-life dataset from the Australian Taxation Office.
Denny, Graham J. Williams, Peter Christen