Sciweavers

PAKDD
2009
ACM

A New Local Distance-Based Outlier Detection Approach for Scattered Real-World Data

Detecting outliers, objects that are grossly different from or inconsistent with the rest of the dataset, is a major challenge in real-world KDD applications. Existing outlier detection methods are ineffective on scattered real-world datasets because of implicit data patterns and parameter-setting issues. To address these issues, we define a novel Local Distance-based Outlier Factor (LDOF) that measures the outlier-ness of objects in scattered datasets. LDOF uses the location of an object relative to its neighbours to determine the degree to which the object deviates from its neighbourhood. The properties of LDOF are analysed theoretically, including its lower bound, its false-detection probability, and its parameter settings. To simplify parameter setting in real-world applications, we employ a top-n technique in our outlier detection approach, where only the objects with the highest LDOF values are regarded as outliers. Compared to conventional approaches (such as top-n KNN a...
Ke Zhang, Marcus Hutter, Huidong Jin
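
The abstract describes LDOF only in words, so the following is a minimal Python sketch rather than the authors' implementation. It assumes the commonly cited form of the score: the average distance from an object to its k nearest neighbours, divided by the average pairwise distance among those neighbours. It also assumes Euclidean distance. The function names `ldof_scores` and `top_n_outliers` and the parameters `k` and `n` are hypothetical, chosen for illustration.

```python
import math
from itertools import combinations


def ldof_scores(points, k):
    """Return one LDOF-style score per point.

    Assumed definition (not given in the abstract): mean distance from a
    point to its k nearest neighbours, divided by the mean pairwise
    distance among those k neighbours. Euclidean distance is assumed.
    """
    if not 2 <= k < len(points):
        raise ValueError("k must satisfy 2 <= k < len(points)")
    scores = []
    for i, p in enumerate(points):
        # k nearest neighbours of p, excluding p itself
        others = [q for j, q in enumerate(points) if j != i]
        neighbours = sorted(others, key=lambda q: math.dist(p, q))[:k]
        # mean distance from p to its neighbours
        knn_dist = sum(math.dist(p, q) for q in neighbours) / k
        # mean pairwise distance among the neighbours themselves
        pairs = list(combinations(neighbours, 2))
        inner_dist = sum(math.dist(a, b) for a, b in pairs) / len(pairs)
        # coincident neighbours give a zero denominator; treat as extreme
        scores.append(knn_dist / inner_dist if inner_dist > 0 else math.inf)
    return scores


def top_n_outliers(points, k, n):
    """Return the indices of the n points with the highest scores."""
    scores = ldof_scores(points, k)
    ranked = sorted(range(len(points)), key=lambda i: scores[i], reverse=True)
    return ranked[:n]


if __name__ == "__main__":
    # Toy data: a tight cluster plus one distant point (index 5).
    data = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (10, 10)]
    print(top_n_outliers(data, k=3, n=1))
```

Under this assumed definition, a score well above 1 means the object lies far from its neighbours relative to how spread out the neighbours are among themselves. Ranking by score and keeping the top n avoids having to pick an absolute threshold, which matches the top-n idea described in the abstract.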
Added 26 Jul 2010
Updated 26 Jul 2010
Type Conference
Year 2009
Where PAKDD