Tri-plots: scalable tools for multidimensional data mining



We focus on the problem of finding patterns across two large, multidimensional datasets. For example, given feature vectors of healthy and of non-healthy patients, we want to answer the following questions: Are the two clouds of points separable? What is the smallest/largest pair-wise distance across the two datasets? Which of the two clouds does a new point (feature vector) come from? We propose a new tool, the tri-plot, and its generalization, the pq-plot, which help us answer the above questions. We provide a set of rules on how to interpret a tri-plot, and we apply these rules to synthetic and real datasets. We also show how to use our tool for classification in cases where traditional methods (nearest neighbor, classification trees) may fail.
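The abstract does not spell out how a tri-plot is built, so the following Python sketch only illustrates the quantities it asks about: the smallest and largest pair-wise distance across two point clouds, and how many pairs fall within a given radius. It assumes Euclidean distance and a brute-force scan over all pairs, which is quadratic and not the scalable method the paper proposes. The function names and the toy `healthy` and `sick` points are hypothetical.

```python
import bisect
import math
from itertools import combinations, product


def cross_distances(a, b):
    """Euclidean distance for every pair (p, q) with p in a and q in b."""
    return [math.dist(p, q) for p, q in product(a, b)]


def self_distances(a):
    """Euclidean distance for every unordered pair of distinct points in a."""
    return [math.dist(p, q) for p, q in combinations(a, 2)]


def pair_count_curve(distances, radii):
    """For each radius r, count the pairs whose distance is <= r."""
    ordered = sorted(distances)
    return [bisect.bisect_right(ordered, r) for r in radii]


# Toy feature vectors (made up for illustration only).
healthy = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
sick = [(3.0, 4.0), (4.0, 4.0)]

cross = cross_distances(healthy, sick)
radii = [1.0, 2.0, 4.0, 8.0]

print("smallest cross distance:", min(cross))
print("largest cross distance:", max(cross))
print("cross pair counts:", pair_count_curve(cross, radii))
print("healthy self pair counts:", pair_count_curve(self_distances(healthy), radii))
```

Comparing the cross-cloud counts with each cloud's own counts at the same radii gives a rough sense of whether the clouds overlap: in this toy data no cross pair lies within radius 4, while every pair inside `healthy` does.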

Agma J. M. Traina, Caetano Traina Jr., Spiros Papadimitriou, Christos Faloutsos


Data Mining | KDD 2001 | Multidimensional Datasets | Real Datasets | Smallest/largest Pair-wise Distance


» ANF a fast and scalable tool for data mining in massive graphs

» Tools for Data Warehouse Quality

» NetCube A Scalable Tool for Fast Data Mining and Compression

» Mining MultiDimensional Constrained Gradients in Data Cubes

» Design and analysis of a multidimensional data sampling service for large scale data analy...

» High Performance Data Mining Using Data Cubes on Parallel Computers

» Support feature machine for classification of abnormal brain activity

Post Info

Added	30 Nov 2009
Updated	30 Nov 2009
Type	Conference
Year	2001
Where	KDD
Authors	Agma J. M. Traina, Caetano Traina Jr., Spiros Papadimitriou, Christos Faloutsos
