Sciweavers

KDD
1998
ACM

Similarity of Attributes by External Probes

In data mining, similarity or distance between attributes is one of the central notions. Such a notion can be used to build attribute hierarchies etc. Similarity metrics can be user-defined, but an important problem is defining similarity on the basis of data. Several methods based on statistical techniques exist. For defining the similarity between two attributes A and B they typically consider only the values of A and B, not the other attributes. We describe how a similarity notion between attributes can be defined by considering the values of other attributes. The basic idea is that in a 0/1 relation r, two attributes A and B are similar if the subrelations σ_{A=1}(r) and σ_{B=1}(r) are similar. Similarity between the two relations is defined by considering the marginal frequencies of a selected subset of other attributes. We show that the framework produces natural notions of similarity. Empirical results on the Reuters-21578 document dataset show, for example, how natural classifications for co...
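
The basic idea in the abstract can be illustrated with a minimal Python sketch. It selects the rows of a 0/1 relation where A = 1 and where B = 1, then compares the marginal frequencies of a chosen set of other ("probe") attributes in the two subrelations. The function name `probe_distance`, the toy data, and the choice to sum absolute differences of frequencies are assumptions made for illustration; the abstract does not state the exact formula used.

```python
def probe_distance(rows, a, b, probes):
    """Distance between attributes a and b of a 0/1 relation.

    rows   -- list of dicts mapping attribute name to 0 or 1
    probes -- names of the other attributes used for the comparison
    Assumption: the distance is the sum over probes of the absolute
    difference of marginal frequencies in the two subrelations.
    """
    rows_a = [row for row in rows if row[a] == 1]  # subrelation with a = 1
    rows_b = [row for row in rows if row[b] == 1]  # subrelation with b = 1
    if not rows_a or not rows_b:
        raise ValueError("each attribute must equal 1 in at least one row")

    def marginal(subrelation, attr):
        # Fraction of rows in the subrelation where attr equals 1.
        return sum(row[attr] for row in subrelation) / len(subrelation)

    return sum(abs(marginal(rows_a, d) - marginal(rows_b, d)) for d in probes)


# Hypothetical toy relation: A and B never co-occur, yet they show the
# same probe frequencies; C behaves differently.
rows = [
    {"A": 1, "B": 0, "C": 0, "P1": 1, "P2": 0},
    {"A": 1, "B": 0, "C": 0, "P1": 1, "P2": 1},
    {"A": 0, "B": 1, "C": 0, "P1": 1, "P2": 0},
    {"A": 0, "B": 1, "C": 0, "P1": 1, "P2": 1},
    {"A": 0, "B": 0, "C": 1, "P1": 0, "P2": 1},
    {"A": 0, "B": 0, "C": 1, "P1": 0, "P2": 1},
]

print(probe_distance(rows, "A", "B", ["P1", "P2"]))
print(probe_distance(rows, "A", "C", ["P1", "P2"]))
```

In this toy relation A and B never appear in the same row, so a measure based only on the values of A and B would see no connection between them, while the probe-based comparison treats them as close. The result depends on which probe attributes are selected.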
Gautam Das, Heikki Mannila, Pirjo Ronkainen
Added 06 Aug 2010
Updated 06 Aug 2010
Type Conference
Year 1998
Where KDD