Sciweavers

DATAMINE
2006

Fast Distributed Outlier Detection in Mixed-Attribute Data Sets

Efficiently detecting outliers or anomalies is an important problem in many areas of science, medicine and information technology. Applications range from data cleaning to clinical diagnosis, from detecting anomalous defects in materials to fraud and intrusion detection. Over the past decade, researchers in data mining and statistics have addressed the problem of outlier detection using both parametric and non-parametric approaches in a centralized setting. However, there are several challenges that must still be addressed. First, most approaches to date have focused on detecting outliers in a continuous attribute space. However, almost all real-world data sets contain a mixture of categorical and continuous attributes. The categorical attributes are typically ignored or incorrectly modeled by existing approaches, resulting in a significant loss of information. Second, there have not been any general-purpose distributed outlier detection algorithms. Most distributed detection algorithm...
Added 11 Dec 2010
Updated 11 Dec 2010
Type Journal
Year 2006
Where DATAMINE
Authors Matthew Eric Otey, Amol Ghoting, Srinivasan Parthasarathy