Sciweavers

PRL
2010

Mining outliers with faster cutoff update and space utilization

Ramaswamy et al.'s distance-based outlier definition is attractive for finding unusual data objects because it requires only a metric distance function between two objects. Unlike many existing algorithms based on the definition of Knorr and Ng, it does not need a neighborhood distance threshold. Bay and Schwabacher proposed an efficient algorithm for this task, ORCA, which can give near-linear time performance. To further reduce the running time, we propose in this paper two algorithms, RC and RS, which use the following two techniques respectively: (i) faster cutoff update, and (ii) space utilization after pruning. We tested RC, RS and RCS (a hybrid approach combining both RC and RS) on several large and high-dimensional real data sets with millions of
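To illustrate the setting the abstract describes, the following is a minimal Python sketch of top-n outlier mining where each object is scored by the distance to its k-th nearest neighbor, with a cutoff-based pruning rule in the spirit of ORCA's nested loop. It is not the RC, RS or RCS algorithm of the paper, whose details are not given in this record. The function name `kth_nn_outliers`, the parameters `k`, `n` and `seed`, and the choice of Euclidean distance are assumptions made for illustration only.

```python
import heapq
import math
import random


def kth_nn_outliers(points, k, n, seed=0):
    """Return up to n (score, index) pairs, largest k-th-NN distance first.

    Hypothetical helper for illustration; Euclidean distance is assumed.
    """
    # Scan neighbors in random order so that close neighbors tend to be
    # met early and the pruning test below fires sooner on average.
    order = list(range(len(points)))
    random.Random(seed).shuffle(order)

    top = []              # min-heap of (score, index); top[0] is the weakest outlier kept
    cutoff = -math.inf    # no pruning until n candidates have been collected

    for i in range(len(points)):
        nearest = []      # negated distances: a max-heap of the k smallest seen so far
        pruned = False
        for j in order:
            if j == i:
                continue
            d = math.dist(points[i], points[j])
            if len(nearest) < k:
                heapq.heappush(nearest, -d)
            elif d < -nearest[0]:
                heapq.heapreplace(nearest, -d)
            # -nearest[0] is an upper bound on the final k-th-NN distance of i.
            # Once it drops to the cutoff, i cannot enter the top n.
            if len(nearest) == k and -nearest[0] <= cutoff:
                pruned = True
                break
        if pruned or len(nearest) < k:
            continue
        score = -nearest[0]
        if len(top) < n:
            heapq.heappush(top, (score, i))
        else:
            heapq.heappushpop(top, (score, i))
        if len(top) == n:
            cutoff = top[0][0]   # cutoff update: score of the weakest kept outlier

    return sorted(top, reverse=True)


if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0), (10.0, 10.0)]
    for score, idx in kth_nn_outliers(pts, k=2, n=2):
        print(idx, round(score, 3))
```

The cutoff only rises as stronger outliers are found, so the faster it grows, the earlier the inner loop can stop for ordinary objects. Only the distance function would need to change to use another metric.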
Chi-Cheong Szeto, Edward Hung
Added: 20 May 2011
Updated: 20 May 2011
Type: Journal
Year: 2010
Where: PRL
Authors: Chi-Cheong Szeto, Edward Hung