We consider the task of anomaly detection in highly noisy multivariate data. In many applications involving real-valued time-series data, such as physical sensor data and economic metrics, discovering changes and anomalies in the way variables depend on one another is of particular importance. Our goal is to robustly compute the “correlation anomaly” score of each variable by comparing the test data with reference data, even when some of the variables are highly correlated (that is, when collinearity exists). To remove spurious dependencies introduced by noise, we focus on the most significant dependencies for each variable. We perform this “neighborhood selection” in an adaptive manner by fitting a sparse graphical Gaussian model. Instead of using traditional covariance selection procedures, we cast this problem as maximum likelihood estimation of the precision matrix (inverse covariance matrix) under the L1 penalty. Then the anomaly score for each variable is computed by e...
Tsuyoshi Idé, Aurelie C. Lozano, Naoki Abe,
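As a reading aid for the abstract above, the L1-penalized maximum likelihood estimation of the precision matrix can be written in its standard form as follows. The notation is an assumption introduced here for illustration and does not appear in the text above: S denotes the sample covariance matrix of the standardized data, Λ the precision matrix, and ρ ≥ 0 the penalty weight.

```latex
% Sketch of the standard L1-penalized Gaussian log-likelihood objective.
% S, \Lambda, and \rho are assumed notation, not taken from the abstract.
\hat{\Lambda}
  = \arg\max_{\Lambda \succ 0}
    \Bigl\{ \ln\det\Lambda \;-\; \operatorname{tr}(S\Lambda) \;-\; \rho\,\lVert\Lambda\rVert_{1} \Bigr\},
\qquad
\lVert\Lambda\rVert_{1} = \sum_{i,j} \lvert\Lambda_{ij}\rvert .

% Under a Gaussian model, \Lambda_{ij} = 0 means variables i and j are
% conditionally independent given all the others, so the selected
% neighborhood of variable i can be read off the nonzero pattern:
\mathcal{N}(i) = \{\, j \neq i : \hat{\Lambda}_{ij} \neq 0 \,\}.
```

Under this formulation, a larger ρ drives more entries of the estimate to exactly zero, so each variable keeps only its strongest dependencies. The penalty also keeps the problem well-posed when S is close to singular, which is the collinear case the abstract mentions.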