Detecting changes in large data sets is an important problem in data mining. Although a variety of change detection algorithms have been developed, in practice it can be difficult to scale them to large data sets because of the heterogeneity of the data. In this paper, we describe a case study involving payment card data in which we built and monitored a separate change detection model for each cell in a multi-dimensional data cube. We describe a system, in operation for the past two years, that builds and monitors over 15,000 separate baseline models, and the process used to generate and investigate alerts from these baselines.

Categories and Subject Descriptors: G.3 [Probability and Statistics]: Statistical computing, statistical software; I.5.1 [Models]: Statistical Models

General Terms: Algorithms
Chris Curry, Robert L. Grossman, David Locke, Stev