The clustering algorithm DBSCAN relies on a density-based notion of clusters and is designed to discover clusters of arbitrary shape as well as to distinguish noise. In this paper,...
Abstract. One of the de ning challenges for the KDD research community is to enable inductive learning algorithms to mine very large databases. This paper summarizes, categorizes, ...
Many KDD systems incorporate an implicit or explicit preference for simpler models, but this use of "Occam's razor" has been strongly criticized by several authors (...
Discovery of frequent patterns has been studied in a variety of data mining settings. In its simplest form, known from association rule mining, the task is to discover all frequent...
To find the optimal branching of a nominal attribute at a node in an L-ary decision tree, one is often forced to search over all possible L-ary partitions for the one that yields t...
Don Coppersmith, Se June Hong, Jonathan R. M. Hosk...
Abstract. We describe a scalable parallel implementation of the self organizing map (SOM) suitable for datamining applications involving clustering or segmentation against large da...
Richard D. Lawrence, George S. Almasi, Holly E. Ru...
The tremendous number of rules generated in the mining process makes it necessary for any good data mining system to provide for powerful query primitives to post-process the gener...